GLOSSARY OF SYMBOLS


The  meanings  of  the  more  important  symbols  are  summarized 
here,  together  with  references  to  the  definitions  of  the  others.  The 
symbols  are  listed  with  respect  to  the  chapters  in  which  they  appear. 


Chapter 2

{A, B, C_i}  Realization of the linear dynamical system with state
x(t), input w(t) and output (observation) y_i.

a  A constant, see (2.3).

c_i  (For i = 1, 2, 3) Constants, see (2.41), (2.43).

d_ij(f_ij)  Delay on the link (i, j) when the intermessage frequency
is f_ij.

E  Expected value of.

{F, G, H}  Realization with state variable z(t), related to the
realization {A, B, C} by the similarity transformation M.

f_ij  Intermessage frequency on link (i, j).

K(s)  Input-output transfer function.

M  Invertible matrix representing the linear transformation
between x(t) and z(t).

N  Number of sensor nodes in the network.

O(i)  Set of nodes #j for which the links (i, j) exist from node #i.

P(t)  Error covariance matrix associated with the optimal
(minimum mean-square error) estimation of x(t).

P_i(t)  Local error covariance matrix at node #i.

$\bar{P}_i$  Steady-state value of P_i(t).
p(t)  Mean-square estimation error of x(t).

p_i  (For i = 1, 2, 3) Constants defined by (2.5), (2.9) and (2.10).

Q  Intensity of w(t).

q  Intensity of w(t), for the scalar case.

R_i  Intensity of v_i(t).

r  Intensity of v(t), for the scalar case.

r_e  Constant, defined by (2.16).

S(t)  Error covariance matrix associated with the optimal
prediction of x(t).

S_o  S(t = 0).

s  Independent time variable.

s_ij(t)  (i, j)th entry of S(t).

s_ijo  (i, j)th entry of S_o.

t  Independent time variable.

V(t)  Error covariance matrix associated with the optimal
prediction of z(t).

V_o  V(t = 0).

v_i(t)  Observation noise vector at sensor node #i.

w(t)  Process noise vector.

x(t)  State variable vector of the linear dynamical system
under observation.

$\hat{x}(t)$  Optimal (minimum mean-square error) estimate of x(t).

y_i(t)  Measurement vector at sensor node #i.

z(t)  State variable vector related to x(t) by the linear
transformation M.

Constants,  see  (2.30). 

Constant,  see  (2.  8). 

Constant,  see  (2.  17). 

Constants,  see  (2.  30). 

Constants,  see  (2.32). 

Constants,  see  (2.  30). 

Scalar  time  function,  see  (2.41). 

ξ(t = 0).

Constants,  see  (2.39). 

(i, j)  Communications link directed from node #i to node #j.
#i  Sensor node #i.

Same  as  in  Chapter  2. 

Expected  value  of. 

(For  i,  j  =  1,2)  A  correction  term  similar  to  r.(t),  but 
one  which  takes  into  consideration  that  the  observations 
of  node  i  were  not  available  since  the  arrival  of  the 
last  message  from  node  #j. 

A  correction  term  similar  to  r.(t),  but  one  which  takes 
into  consideration  that  the  observations  of  node  #i  were 
not  available  since  the  arrival  of  the  last  message  from 
node  H. 


x> 


P(t) 


Global  error  covariance  matrix  associated  with  x(t). 


P^(t)  Local  error  covariance  matrix  associated  with  x.(t). 

Rj  Same  as  in  Chapter  2. 

r.(t)  A  term  needed  to  correct  correlation  of  x.  (t)  and  x9(t) 

1  A  L  £ 

due  to  x(0)  and  w(t),  whose  statistical  specifications 
are  common  knowledge  to  both  sensor  nodes. 

Q  Same  as  in  Chapter  2. 

S.(t)  Global  error  covariance  matrix  associated  with  x.(t). 

s  Independent  time  variable. 


t  Independent  time  variable. 

t^  Time  instant  the  j**1  message  is  sent  from  both  sensor 

nodes  to  the  destination  when  reporting  times  are 
synchronized. 

tj  Time  instant  the  j  message  is  sent  from  node  #i  to 

the  destination  when  reporting  times  are  not  synchronized 

* 

t|  See  Fig.  3.2. 

tj  See  Fig.  3.2. 

Vj(t)  Same  as  in  Chapter  2. 

w(t)  Same  as  in  Chapter  2. 

x(t)  Same  as  in  Chapter  2. 

x(t)  Global  optimal  estimate  of  x(t). 

^(t)  Local  optimal  estimate  of  x(t)  at  node  #i. 

Xj(t)  (For  i,j  =  1,2)  Global  optimal  estimate  of  x(t),  but  one 

which  takes  into  consideration  that  the  observations  of 
node  #j  #  i  were  not  available  since  the  arrival  from 
node  #j. 


x.(t)  Local optimal estimate of x(t) at node #i, but one which

takes  into  consideration  that  the  observations  of  node  # i 
were  not  available  since  the  arrival  of  the  last 
message  from  node  #i. 

yj(t)  Same  as  in  Chapter  2. 

Zj( t)  Local  error  covariance  matrix  associated  with  Xj(t). 


Chapter  4 


{A.B.C.} 


A.B.C 


Same  as  in  Chapter  2. 

(In  Sections  4.  4.  2  and  4.  4.  3)  Sets  of  paths  j  from  the 
sensor  node  to  the  destination  node,  with  associated 
average  intermessage  periods  and  corresponding 
delays  d^. 

Set  containing  the  same  paths  j  as  A,  but  with  T3  =  T-*  . 
Constant,  see  (4.6). 

The  value  of  the  argument  of  for  p*.  See  (4. 13). 

The  value  of  the  argument  of  for  p  .  See  (4.  26). 

m  wc 

Ci  e  B  :  i  e  V} 

(isB:UV} 

Constant,  see  (4.6). 

The  value  of  the  second  argument  of  rj  for  p*.  See  (4.  13). 

m 

The  value  of  the  second  argument  of  V  for  p 

m  wc 

See  (4.  26). 

Channel  capacity  of  path  j. 


Channel  capacity  of  link  (i,  j). 


Vector  of  C  's. 


C  " 

C*  element  of  C_  * 

c  (In  Section  4. 2)  Unit  of  time:  c  ^  Tj/kj  =  Tg/kg. 

where  k^  and  k2  are  mutually  prime  integers. 

D  (In  Section  4.  4)  Effective  (information)  delay  from  the 

sensor  node  for  the  optimum  adjustment  of  the 
departure  times  on  the  paths  j  to  the  destination,  given 
constant  values  of  TJ  and  dJ. 

(In  Section  4.  4)  Optimal  value  of  D,  minimized  with 
respect  to  average  departure  frequencies  and  departure 
times  on  the  paths  to  the  destination. 

D(A)  Term  defined  by  (4.58). 

C(A)  (In  Section  4.  4)  Effective  (information)  delay  from  the 

sensor  node  for  the  optimum  adjustment  of  departure 
times  on  the  paths  which  are  elements  of  set  A. 

D(A  -  f  k} )  Term  defined  by  (4. 104). 

Dj  (In  Section  4.4.4)  Effective  (information)  delay  from 

node  #i  to  the  destination  at  each  local  minimum  of  p  , 

W  C 

reflecting  optimal  adjustment  of  the  departure  times  on 
the  links  originating  from  node  #i,  for  given  average 
departure  frequencies  and  corresponding  communication 
delays. 

(In  Section  4.  2)  Delay  on  the  link  from  node  #i  to  the 
destination  node  when  the  intermessage  period  is  Tj. 

Delay  on  link  (i,  j)  when  the  Intermessage  period  is  T^. 

(In  Section  4.4)  Delay  on  path  j  from  the  sensor  node  to 
the  destination  when  the  reciprocal  of  the  average  Inter- 
message  frequency  is  T^. 


S’-,  (In  Section  4.4)  Weighted  average  of  delays  on  path  3 

D 

with  3  e  B. 


(In  Section  4.  4)  Effective  communication  delay  from 
the  sensor  node  to  the  destination. 

Average  intermessage  frequency  on  path  3. 

■  * 

f^  Optimum  value  of  f^. 

G  A  set  of  sets  of  paths,  defined  by  (4.  83). 

H^T1,  •••,  Tk_1,  Tk+1,  •••,  TL):  A  hyperplane  in  lR1""1,  defined  by 
(4.  94). 

u  th 

I*  Set  of  indices,  defined  by  (4. 118),  for  the  k  iteration  of 

the  algorithm  of  Proposition  4.7. 


Interval  A  (0,  A  ). 

w  c 

Interval  B  [Awc,  c  ** 


L 


(In  Section  4.  4)  Number  of  paths  from  the  sensor  node 
to  the  destination. 


mk 


O(i) 

P(t) 

F 

Pm(t) 

m 


The  first  non-negative  integer  m  such  that  (4. 133)  is 

|L 

satisfied,  for  the  k  iteration  of  the  algorithm  of 
Proposition  4.  7. 

Same  as  in  Chapter  2. 

Same  as  in  Chapter  2. 

Steady-state  value  of  P(t). 

(In  Section  4.  3)  Error  covariance  matrix  associated 
with  the  optimal  estimation  of  x(t)  with  nodes 
#m,  #m+l,  •••,  #N  reporting. 


p(t) 


Same  as  in  Chapter  2. 


i  peak  in  a  certain  period  of  p(t). 


A  certain  peak  of  p(t)  given  by  (4.  5). 

(For  i  =  1,2)  Candidate  peaks  for  the  highest  peak  for 
A  e  Interval  A. 

(For  I  =  1,2)  Candidate  peaks  for  the  highest  peak  for 
A  e  Interval  B. 


pQp^  (In  Section  4. 2)  Minimum  value  of  the  highest  peak  of 

p(t)  with  respect  to  A  for  given  TJf  d  ,  T2  and  d2 
values. 

p  Maximum  value  in  time  of  p(t)  for  the  worst  possible 

w  o 

timing  relationship  between  the  reporting  times  of  the 
nodes. 

Q  Same  as  in  Chapter  2. 

Same  as  in  Chapter  2. 

]Rm  m-dimensional  Euclidean  space. 

S(t)  (In  Section  4.  3)  Same  as  in  Chapter  2. 

S  (t.,t9)  (In  Section  4.2)  Error  covariance  matrix  resulting  from 

prediction  during  the  last  t2  seconds,  after  getting 
observations  from  node  #m  for  tj  seconds. 

s^  Scalar  sequences  defined  by  (4. 120). 

(In  Section  4.  3)  Spanning  tree  linking  the  sensor  nodes 
to  the  destination  node. 


(In  Section  4.2)  The  intermessage  period  from  sensor 
node  #i  to  the  destination  node. 


argmin  T.  +  d.  (T,). 


Intermessage  period  on  link  (i,j). 

argmin  T..  +  d  (T  ). 

T  >  0  1J 

ij 

(In  Section  4.  4)  The  reciprocal  of  the  average  inter¬ 
message  frequency  on  path  j. 

The  optimal  values  of  T^  which  minimize  D. 

(In  Section  4. 4)  Effective  period,  or  the  inverse  of  the 
sum  of  the  average  departure  frequencies  on  the  paths 
from  the  sensor  node  to  the  destination  node. 

Vector  of  T^'s. 

f  U  1L 

Vector  of  coordinates  T^  with  i  e  l£,  for  the  k  iteration 
of  the  algorithm  of  Proposition  4. 7. 

Vector  of  coordinates  T^  with  i  i  I*,  for  the  k  iteration 
of  the  algorithm  of  Proposition  4.  7. 

ith  element  of  T,  for  the  kth  iteration  of  the  algorithm  of 
Proposition  4.  7. 

Independent  time  variable. 

Defined  by  pwc  =  p(twc>. 

Time  of  departure  of  the  ith  message  from  sensor 
node  #n. 

Time  of  arrival  of  the  Ith  message  from  sensor  node  #n. 

(For  i  =  1,2)  Time  duration  defined  by  (4.  37),  (4.  38). 

—A 

(For  i  =  1,2)  Defined  similarly  to  t  .  for  a  particular 

SLl 

e  Interval  B. 

(In  Section  4.4)  Time  of  departure  of  the  j**1  message 
from  the  sensor  node. 


18 


(In  Section  4.4)  Time  of  arrival  of  the  jth  message  at 
the  destination  node. 

Upper  limit  on  T^,  imposed  for  optimization  purposes. 
Vector  of  U^'s. 
i**1  element  of  IJ. 

Time  instant  defined  by  (4.  7). 

(In  Section  4.4)  Largest  set  of  "useful  paths'  each  of 
which  contribute  to  the  minimization  of  D. 

[j  e  A  :  j  i  V}  in  the  proof  of  Proposition  4.  5. 

(In  Section  4.4)  i*^  element  of  set  V. 

Same  as  in  Chapter  2. 

(In  Section  4.4)  Sets  of  paths  defined  by  (4.69),  (4.70). 
Same  as  in  Chapter  2. 

Same  as  in  Chapter  2. 

Set  of  paths  defined  by  (4.59). 

Same  as  in  Chapter  2. 

A  subset  of  fl,  2,  •  •  • ,  L} . 

The  sequence  defined  by  (4. 124). 


A  constant  out  of  the  interval  (0, 1). 

A  constant  out  of  the  interval  (0,  l),  for  the  kth  iteration 
of  the  algorithm  of  Proposition  4.  7. 

Scalar  sequences  defined  by  (4. 130). 
m


wc 


sup 


kB 


fi(j) 


e 


c 


Scalar  sequences  defined  by  (4. 129). 

(Section  4.2)  Phase  difference  between  the  reporting 
sequences  of  the  two  sensor  nodes. 

(Section  4.2)  Value  of  A  which  minimizes  the  highest 
peak  of  p(t)  for  T*.  d*,  Tg,  dg. 

(Section  4.2)  Value  of  A  which  maximizes  the  highest 
peak  of  p(t)  for  given  dJ#  T dg. 

Upper  bound  for  A,  given  by  (4.22). 

A  constant  out  of  Interval  A. 

A  constant  out  of  Interval  B. 

(In  Section  4.2)  Time  difference  between  the  departure 
of  message  #j  from  node  #2  and  the  departure  of  the  most 
recent  message  from  node  #1. 

A  positive  constant  scalar. 

A  scalar  defined  by  (4. 119). 

A  constant  out  of  the  interval  (0,  $). 


WV 


Trace  S  (t. ,  t„). 
m  l  i 

A  positive  definite  symmetric  matrix,  for  the  k**1  iteration 
of  the  algorithm  of  Proposition  4.  7. 


X(t) 


i 


*k 


ft 


(Section  4.  3)  Trace  S(t). 

A  positive  constant  scalar. 

Scalar  sequences  satisfying  (4. 121). 

Search  direction  for  the  k**1  iteration  of  the  algorithm  of 
Proposition  4.  7. 

i  # 

Vector  of  coordinates  with  i  e  l£. 
D_i  (In Section 5.3) (For i = 1, 2) Aggregate delay on

link  (i,  3). 

d.  (In  Section  5.  3)  (For  i  =  1,2)  Communication  delay 

on  link  (i,  3). 

3”  (In  Section  5.  2)  Random  variable  denoting  the  delay 

incurred  by  a  message  sent  from  the  sensor  node  to  the 
destination  node. 

(In  Section  5.  2)  Random  variable  denoting  the  delay 
incurred  by  the  message  sent  at  t  =  t  +  kT. 

E  Expected  value  of. 


k  (In  Section  5.2)  Scalar  factor  defined  by  (5.  8). 

k.j  Constants,  see  (5.  12)  and  (5. 13). 

m^T)  Expected  value  of  d  as  a  function  of  T. 

P(t)  Error  covariance  matrix  associated  with  the  optimal 

prediction  of  x(t). 

p(t)  Same  as  in  Chapter  2. 

p  Prob  ("n-tuple"  at  t  ). 

n  o 

jje 

p  (In  Section  5.  3)  Minimum  mean-square  error. 


Q 


Same  as  in  Chapter  2. 


Same  as  in  Chapter  2. 

Same  as  in  Chapter  2. 

Same  as  in  Chapter  2. 

Constant  defined  by  (5. 14). 

Steady-state  error  covariance  matrix  associated  with 
the  optimal  estimation  of  x(t). 


(In  Section  5.2)  Intermessage  period  from  the  sensor 
node  to  the  destination  node. 


(In  Section  5.3)  (For  i  =  1,2)  Intermessage  period 
from  sensor  node  #i  to  the  destination  node. 

Same  as  in  Chapter  2. 

Same  as  in  Chapter  2. 

Same  as  in  Chapter  2. 

Same  as  in  Chapter  2. 

Constant,  defined  by  (5. 14). 

Constant,  defined  by  (5. 14). 

(In  Section  5.  2)  Random  variable  for  the  delay  of 
information  (message  from  other  node  or  observations) 
from  the  time  it  arrives  at  an  intermediate  node  to  the 
time  it  arrives  at  the  destination  node. 

Same  as  in  Chapter  2. 


For  all. 

There exist(s).
Union. 

Intersection. 
Element  of. 

Not  an  element  of. 


Subset  of. 
CHAPTER 1
INTRODUCTION 

Recently  there  has  been  a  significant  amount  of  work  on  the 
subject  of  decentralized  estimation.  The  various  approaches  can  be 
divided into two classes. The first class consists of methods which use
the distributed structure of the problem in such a way as to achieve an
overall estimator whose error corresponds to that of a fully centralized
estimator [1], [2]. The second class of approaches consists of utilizing
a  fixed  structure  to  achieve  the  best  performance  possible  with  this 
restricted  structure  [3],  [4]. 

The  proposed  thesis  research  can  be  categorized  into  the  second 
class  of  problems.  It  incorporates  time  delay  constraints  on  information 
flow,  a  characteristic  which  was  not  addressed  in  the  above  work. 

While  the  research  does  not  intend  to  extend  the  above  work  with  added 
delay  constraints,  it  aims  to  address  certain  decentralized  estimation 
problems  with  an  underlying  delay  formulation.  The  structure  that  is 
considered  here  is  a  network  of  processors  which  also  have  sensors  for 
taking  measurements.  Various  types  of  delays,  such  as  queueing, 
processing  and  propagation  between  two  nodes,  are  aggregated  into  a 
formulation  which  is  a  function  of  the  traffic  rate  between  the  nodes. 

In the network considered, state estimates are desired at a
destination node. With no delays, getting observations from all sensors
as often as possible would give the best performance, since the
observation noise processes of the sensors are assumed to be independent.
However, for a formulation where the delays are monotonically increasing
functions of message traffic, sending observations at a high rate would
cause large delays. Since there is statistical uncertainty in the evolution
of the system under observation, delays cause a degradation of estimation
performance. Therefore, there should be an optimum rate at which
information is sent from the sensors to the destination.

Also taking into account that some sensors provide more information
about the states observed (with respect to a measure like the signal
to  noise  ratio),  regulating  the  amount  of  information  sent  from  each  node 
is  one  of  several  ancillary  issues.  Routing  to  minimize  delays  directly 
and  data  compression  (by  combining  observations  or  local  estimates)  to 
reduce  traffic,  thus  minimizing  delay  indirectly,  are  the  other  issues 
addressed.  Problem  formulations  of  routing  and  flow  control  in 
computer  communication  networks  assume  given  demand  statistics  for 
each  origin-destination  pair  [5].  However,  in  this  research,  the  input 
rate  at  each  node  will  be  a  control  variable  also.  Thus  this  work  can 
also  be  viewed  as  the  development  of  a  flow  control  scheme  for  one 
particular  class  of  network  problems,  namely  decentralized  linear 
estimation. 
CHAPTER 2

MODELS  AND  OBJECTIVE  FUNCTIONS 

2.1  Introduction

In  this  chapter  we  will  present  the  components  of  the  general 
problem  formulation.  This  will  include  the  specification  of  the  network 
structure,  the  class  of  systems  under  observation,  the  observation 
model  for  the  sensor  nodes,  the  content  of  messages  sent  between  the 
nodes,  the  models  for  the  delays  incurred  by  these  messages  on  the 
communication  links  and,  finally,  the  type  of  objective  functions 
considered. 

2.2  The  Network  Structure 

The  network  is  composed  of  N  sensor  nodes  and  a  destination 
node,  where  the  state  estimate  of  the  observed  system  is  desired.  Each 
node #i is directly connected to a set of neighboring nodes, O(i), through
directed links (i, j), j ∈ O(i). The links have distortionless communication
channels  with  finite  capacity. 

2.3  The System and the Observations

The  system  under  observation  is  modelled  as  a  linear  dynamical 
time-invariant  stochastic  system: 

dx(t)  =  Ax(t)dt  +  Bdw(t).  (2.1) 

Node  #i  makes  observations: 

dy_i(t) = C_i x(t)dt + dv_i(t).  (2.2)

Initial state x(t_o) is uncorrelated with w and v_i. Also,
w, v_1, v_2, ..., v_N are Brownian motion processes which are all
uncorrelated. These assumptions have been adopted for the sake of
simplicity; extensions are possible for the cases of correlated and
colored noise inputs [6]. Wiener process w(t) has intensity Q, and
v_i(t) have intensity R_i.

The  system  and  observation  models  are  assumed  to  be  time- 
invariant. (Alternatively, we might assume that these change sufficiently
slowly so as not to interfere with the convergence of optimization
algorithms  discussed  in  this  thesis.) 

We also assume that Q > 0, R_i > 0, i = 1, ..., N; and that the
pair {A, B} is stabilizable and the pairs {A, C_i} are detectable for
i = 1, ..., N.

For  the  purposes  of  this  thesis,  we  need  to  impose  some  further 
restrictions  on  the  dynamical  and  statistical  structure  of  the  system  and 
observation  models.  Before  being  more  specific,  we  present  some 
facts. 

Consider the scalar linear stochastic dynamical system:

dx(t)  =  ax(t)dt  +  dw(t),  (2.3) 

with  the  scalar  observations 

dy(t)  =  x(t)dt  +  dv(t).  (2.4) 

w(t)  and  v(t)  are  independent  zero-mean  scalar  Brownian  motion 
processes,  with  intensities  q  and  r,  respectively,  which  are  strictly 
positive.  Also, 

E[x(0)]^2 = p_o.  (2.5)

If the observations are processed by a filter which minimizes the
mean-square estimation error at all times (e.g. a Kalman filter), then
p(t), the mean-square estimation error at time t, is given by the Riccati
differential equation:

$\dot{p}(t) = 2a\,p(t) + q - (1/r)\,p^2(t), \quad p(0) = p_o.$  (2.6)

Since this equation is scalar, it can readily be solved, yielding [7]:

$p(t) = p_1 - \dfrac{p_1 + p_2}{1 + [(p_o + p_2)/(p_1 - p_o)]\,e^{2\beta t}},$  (2.7)

where

$\beta = \sqrt{a^2 + (q/r)},$  (2.8)

$p_1 = r(\beta + a),$  (2.9)

$p_2 = r(\beta - a).$  (2.10)

Note that we have p(t) → p_1 as t → ∞.

Lemma 2.1: If p_1 > p_o, then $\dot{p}(t) > 0$ for all t > 0; i.e. the
mean-square estimation error is a monotonically increasing function of time.

Proof: From (2.7),

$\dot{p}(t) = \dfrac{p_1 + p_2}{\{1 + [(p_o + p_2)/(p_1 - p_o)]\,e^{2\beta t}\}^2} \cdot 2\beta \left(\dfrac{p_o + p_2}{p_1 - p_o}\right) e^{2\beta t}.$  (2.11)

Since all terms are strictly positive, $\dot{p}(t) > 0$ for all t > 0.

Q.E.D.


Let  us  briefly  describe  an  example  situation  where  the  above 
result  is  relevant.  Assume  that  there  are  two  sensors  making 
observations: 


dy_1(t) = x(t)dt + dv_1(t),  (2.12)

dy_2(t) = x(t)dt + dv_2(t),  (2.13)

of the system (2.3).


v_1 and v_2 are independent zero-mean Brownian motion processes
with intensities r_1 and r_2, respectively. Assume that the sensors have
been making observations for a long time, so that the optimal filter
processing them is in the steady state. Then the steady-state
mean-square estimation error, p_o, is given by the algebraic Riccati
equation [7]:

$0 = 2a\,p_o + q - \left(\dfrac{1}{r_1} + \dfrac{1}{r_2}\right) p_o^2,$  (2.14)

with the solution

$p_o = r_e(\beta_e + a),$  (2.15)

where

$r_e = \left(\dfrac{1}{r_1} + \dfrac{1}{r_2}\right)^{-1},$  (2.16)

$\beta_e = \sqrt{a^2 + (q/r_e)}.$  (2.17)


At time t = 0, assume that one of the sensors, say the second one,
stops making observations. In this case the propagation in time of the
mean-square error, p(t), is given by (2.7) - (2.10), with r replaced by
r_1. As t → ∞, p(t) approaches the steady-state value p_1, given by
(2.9). It is easy to show that p_1 > p_o. Now Lemma 2.1 indicates that
the transition of p(t) from p_o to p_1 is monotonic.
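The two-sensor example above can be checked numerically. The following is a minimal sketch, assuming illustrative parameter values (a, q, r_1, r_2 are not taken from the thesis) and hypothetical function names; it evaluates the closed form (2.7) - (2.10) starting from the two-sensor steady state (2.15) - (2.17) and records the error on a time grid.

```python
import math

def riccati_closed_form(t, a, q, r, p_o):
    """Closed-form solution (2.7) of the scalar Riccati equation (2.6)."""
    beta = math.sqrt(a * a + q / r)      # (2.8)
    p1 = r * (beta + a)                  # (2.9): steady-state error
    p2 = r * (beta - a)                  # (2.10)
    k = (p_o + p2) / (p1 - p_o)
    return p1 - (p1 + p2) / (1.0 + k * math.exp(2.0 * beta * t))

def steady_state_two_sensors(a, q, r1, r2):
    """Steady-state error (2.15) while both sensors are observing."""
    r_e = 1.0 / (1.0 / r1 + 1.0 / r2)    # (2.16): equivalent intensity
    beta_e = math.sqrt(a * a + q / r_e)  # (2.17)
    return r_e * (beta_e + a)

# Assumed, illustrative values (not from the thesis).
a, q, r1, r2 = -0.5, 1.0, 2.0, 1.0

p_o = steady_state_two_sensors(a, q, r1, r2)   # error before sensor 2 stops
p1 = r1 * (math.sqrt(a * a + q / r1) + a)      # error with sensor 1 alone

# Error trajectory after sensor 2 stops at t = 0 (r replaced by r1).
times = [0.1 * n for n in range(101)]
errors = [riccati_closed_form(t, a, q, r1, p_o) for t in times]
is_increasing = all(later > earlier for earlier, later in zip(errors, errors[1:]))
```

With these values the trajectory starts at p_o, stays below p_1 and rises at every grid point, which is the behavior Lemma 2.1 describes.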

Now  consider  the  multidimensional  linear  stochastic  dynamical 
system: 

dz(t)  =  Fz(t)dt  +  Gdw(t)  (2.  18) 

w(t)  is  a  zero-mean  Brownian  motion  process  with  intensity  Q, 


which  is  positive  definite.  Also, 


where V_o satisfies the algebraic matrix Riccati equation:

$0 = F V_o + V_o F' + G Q G' - V_o H' R^{-1} H V_o.$  (2.20)

Hence we assume that there is an initial estimate of the state z at
time t = 0. There are no observations for t > 0, so for optimal
prediction which minimizes the mean-square prediction error at all
times, the covariance matrix V(t) of the prediction error at time t is
given by the linear matrix differential equation [7]:

$\dot{V}(t) = F V(t) + V(t) F' + G Q G', \quad V(0) = V_o.$  (2.21)

Lemma 2.2: Unless matrix F is defective, there exists an
invertible linear transformation:

x(t) = Mz(t)  (2.22)

such that the error covariance matrix S(t) associated with the optimal
prediction of x(t) is such that:

$\dfrac{d}{dt}\,[\mathrm{tr}\, S(t)] > 0$ for all t ≥ 0,  (2.23)

i.e. the mean-square estimation error of x(t) is a monotonically
increasing function of time.

Proof: From (2.18) and (2.22), x(t) satisfies the equation:

dx(t) = Ax(t)dt + Bdw(t),  (2.24)

where

$A = M F M^{-1},$  (2.25)

B = MG.  (2.26)

Therefore (2.21) can be written as:

$M \dot{V}(t) M' = A M V(t) M' + M V(t) M' A' + B Q B',$  (2.27)

or as

$\dot{S}(t) = A S(t) + S(t) A' + B Q B',$  (2.28)

where

$S(t) = E\{[x(t) - \hat{x}(t)]\,[x(t) - \hat{x}(t)]'\}$

$\quad = E\{[M z(t) - M \hat{z}(t)]\,[M z(t) - M \hat{z}(t)]'\}$

$\quad = M \left(E\{[z(t) - \hat{z}(t)]\,[z(t) - \hat{z}(t)]'\}\right) M'$

$\quad = M V(t) M'.$  (2.29)

Now, unless matrix F is defective, matrix $A = M F M^{-1}$ can be put
into the form shown in Eq. (2.30) by an appropriate
choice of the transform matrix M [9],


where β_1, β_2, ..., β_k are the real eigenvalues of matrix F and
α_1, ..., α_m are the real parts of its complex-conjugate eigenvalue pairs.

Defining the entries of S(t) and BQB' as:

$[S(t)]_{ij} = s_{ij}(t),$  (2.31)

$[B Q B']_{ij} = \gamma_{ij},$  (2.32)

by (2.30), the components of (2.28) can be written as:

$\dot{s}_{kk}(t) = 2\beta_k\, s_{kk}(t) + \gamma_{kk},$

$\dot{s}_{(k+1)(k+1)}(t) + \dot{s}_{(k+2)(k+2)}(t) = 2\alpha_1 [s_{(k+1)(k+1)}(t) + s_{(k+2)(k+2)}(t)] + [\gamma_{(k+1)(k+1)} + \gamma_{(k+2)(k+2)}],$

$\dot{s}_{(k+2m-1)(k+2m-1)}(t) + \dot{s}_{(k+2m)(k+2m)}(t) = 2\alpha_m [s_{(k+2m-1)(k+2m-1)}(t) + s_{(k+2m)(k+2m)}(t)] + [\gamma_{(k+2m-1)(k+2m-1)} + \gamma_{(k+2m)(k+2m)}].$  (2.33)

Defining:

$S_o = S(0) = M V_o M',$  (2.34)

by (2.25), (2.20) can be written as:

$0 = A M V_o M' + M V_o M' A' + M G Q G' M' - M V_o M' C' R^{-1} C M V_o M',$  (2.35)

or as:

$0 = A S_o + S_o A' + B Q B' - S_o C' R^{-1} C S_o,$  (2.36)

where:

$C = H M^{-1}.$  (2.37)

With the entries of S_o and of S_o C' R^{-1} C S_o denoted by

$[S_o]_{ij} = s_{ijo},$  (2.38)

$[S_o C' R^{-1} C S_o]_{ij} = \sigma_{ij},$  (2.39)

the components of (2.36) are:

$0 = 2\beta_k\, s_{kko} + \gamma_{kk} - \sigma_{kk},$

$0 = 2\alpha_1 (s_{(k+1)(k+1)o} + s_{(k+2)(k+2)o}) + (\gamma_{(k+1)(k+1)} + \gamma_{(k+2)(k+2)}) - (\sigma_{(k+1)(k+1)} + \sigma_{(k+2)(k+2)}),$

$0 = 2\alpha_m (s_{(k+2m-1)(k+2m-1)o} + s_{(k+2m)(k+2m)o}) + (\gamma_{(k+2m-1)(k+2m-1)} + \gamma_{(k+2m)(k+2m)}) - (\sigma_{(k+2m-1)(k+2m-1)} + \sigma_{(k+2m)(k+2m)}).$  (2.40)

Equations in (2.33) and (2.40) all have the form:

$\dot{\xi}(t) = c_1\,\xi(t) + c_2, \quad c_2 > 0,$  (2.41)

$\xi(0) = \xi_o > 0,$  (2.42)

$0 = c_1\,\xi_o + c_2 - c_3, \quad c_3 > 0,$  (2.43)

which has the solution:

$\xi(t) = \xi_o\, e^{c_1 t} + \dfrac{c_2}{c_1}\,(e^{c_1 t} - 1), \quad t \ge 0.$  (2.44)

Then:

$\dot{\xi}(t) = (c_1\,\xi_o + c_2)\, e^{c_1 t} = c_3\, e^{c_1 t} > 0, \quad t \ge 0,$  (2.45)

by (2.43).

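Hence:

$\dot{s}_{ii}(t) > 0, \quad i = 1, 2, \ldots, k, \quad t \ge 0,$  (2.46)

$\dot{s}_{ii}(t) + \dot{s}_{(i+1)(i+1)}(t) > 0, \quad i = k+1, k+3, \ldots, k+2m-1, \quad t \ge 0,$  (2.47)

and

$\dfrac{d}{dt}\,\mathrm{tr}\, S(t) = \dfrac{d}{dt}\,E\{[x(t) - \hat{x}(t)]'\,[x(t) - \hat{x}(t)]\} = \sum_{i=1}^{k+2m} \dot{s}_{ii}(t) > 0$ for t ≥ 0.  (2.48)

Q.E.D.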
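The step in the proof that pairs the diagonal entries belonging to a complex eigenvalue pair can be illustrated for a single 2 x 2 block. The sketch below assumes the block has the form [[alpha, omega], [-omega, alpha]]; the name omega for the imaginary part, the helper names and all numerical values are assumptions made for illustration. It evaluates the right-hand side of (2.28) and compares the resulting trace rate with the paired form used in (2.33).

```python
def mat_mul(X, Y):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    """Transpose of a 2 x 2 matrix."""
    return [[X[j][i] for j in range(2)] for i in range(2)]

def s_dot(S, A, Gamma):
    """Right-hand side of (2.28): A S + S A' + Gamma, with Gamma = B Q B'."""
    AS = mat_mul(A, S)
    SAt = mat_mul(S, transpose(A))
    return [[AS[i][j] + SAt[i][j] + Gamma[i][j] for j in range(2)]
            for i in range(2)]

# Assumed, illustrative values (not from the thesis).
alpha, omega = -0.3, 2.0
A = [[alpha, omega], [-omega, alpha]]   # block for the pair alpha +/- i*omega
Gamma = [[1.0, 0.2], [0.2, 0.5]]        # symmetric, plays the role of B Q B'
S = [[0.4, 0.1], [0.1, 0.3]]            # symmetric covariance at some instant

D = s_dot(S, A, Gamma)
trace_rate = D[0][0] + D[1][1]
paired_form = 2.0 * alpha * (S[0][0] + S[1][1]) + (Gamma[0][0] + Gamma[1][1])
```

In this sketch the off-diagonal coupling through omega cancels in the sum because S is symmetric, so the trace rate depends only on alpha. A single diagonal rate such as D[1][1] can still be negative, which is why the proof works with the sum over each pair rather than with the individual entries.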

Now  we  will  bring  some  clarification  and  justification  to  the 
hypotheses  of  the  above  result. 

Consider the system described by (2.1) and (2.2) for one sensor
only, with y_1 = y, v_1 = v, C_1 = C, R_1 = R. Also consider the system
resulting from the transformation (2.22):

dx(t) = Ax(t)dt + Bdw(t),  (2.49)

dy(t) = Cx(t)dt + dv(t),  (2.50)

where A, B and C are given by (2.25), (2.26) and (2.37).

It is easy to show that the systems (2.1) - (2.2) and (2.49) - (2.50)
have the same input-output transfer function K(s), that is:

$K(s) = C(sI - A)^{-1} B + I = H(sI - F)^{-1} G + I$ for all s.  (2.51)

In other words, given the transfer function, a minimal realization of the
system in state-space form is unique up to a similarity transformation.
Therefore we wish to argue that an invertible transformation of state
variable coordinates is not a serious compromise of the generality of the
formulation.
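The invariance stated in (2.51) can be spot-checked numerically. The following is a small sketch for a two-state, single-input, single-output case; the matrices F, G, H, M, the helper names and the test points are assumed for illustration. The constant term I appears on both sides of (2.51) and is omitted.

```python
def inv2(X):
    """Inverse of a 2 x 2 matrix stored as nested lists."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]

def mul(X, Y):
    """Product of two small matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transfer(C, A, B, s):
    """Evaluate C (sI - A)^{-1} B for a two-state realization."""
    sI_minus_A = [[s - A[0][0], -A[0][1]],
                  [-A[1][0], s - A[1][1]]]
    return mul(mul(C, inv2(sI_minus_A)), B)[0][0]

# Assumed, illustrative realization {F, G, H} and transformation M.
F = [[-1.0, 2.0], [0.0, -3.0]]
G = [[1.0], [1.0]]
H = [[1.0, 0.5]]
M = [[2.0, 1.0], [1.0, 1.0]]
M_inv = inv2(M)

A = mul(mul(M, F), M_inv)   # (2.25)
B = mul(M, G)               # (2.26)
C = mul(H, M_inv)           # (2.37)

# Compare both realizations at a few points away from the eigenvalues of F.
test_points = [0.5, 1.0, 2.0 + 1.0j]
gaps = [abs(transfer(C, A, B, s) - transfer(H, F, G, s)) for s in test_points]
```

The entries of gaps should be at rounding-error level, since the factors M and its inverse cancel inside C (sI - A)^{-1} B.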

Assume  that  the  sensor  has  been  making  observations  for  a  long 
time,  so  that  the  optimal  filter  processing  them  is  in  the  steady  state. 
Then  under  the  stabilizability  and  detectability  assumptions  for  the  pairs 
(F, G)  and  (F,H),  which  imply  the  same  conditions  for  the  pairs  (A,B) 
and (A, C) [10], the steady-state error covariance matrix S_o associated
with the optimal estimation of x(t) is given by the algebraic matrix
Riccati equation (2.36) [6].

At time t = 0, assume that the sensor stops making observations.
Then the propagation in time of the error covariance matrix S(t)
associated with the optimal prediction of x(t) for t > 0 is given by (2.28),
with the initial value S(0) = S_o. Now Lemma 2.2 indicates that
tr S(t), i.e. the mean-square prediction error of x(t), increases
monotonically with time.

In  view  of  the  preceding  discussion,  the  worst-case  optimization 
results  presented  in  Chapter  4  are  valid  with  the  following  restrictions 
imposed  on  the  dynamical  system  under  observation: 


1. In Sections 4.2 and 4.3, the dynamical system is assumed
to be one-dimensional.

2. In Sections 4.4.2 and 4.4.3, the system is assumed to be
multidimensional with the structure indicated in (2.30).

3. In Section 4.4.4, there is no need to impose additional
structure on the system; the A matrix can be any
multidimensional square real matrix.

It  should  be  noted  that  the  conditions  imposed  on  the  system 
models in Sections 4.2, 4.3, 4.4.2 and 4.4.3 are sufficient to ensure that
the mean-square error is a monotonically increasing function of time for
the  optimization  problem  considered.  However,  they  are  not  necessary; 
there  exist  more  general  systems  which  display  monotonicity  without 
satisfying  the  above  conditions. 

2.4  The Estimation Scheme and the Content of Messages

We  assume  that  at  each  sensor  node  #i  there  is  a  local  linear 
least-squares  estimator  which  calculates  the  current  state  estimate 
$\hat{x}_i(t)$ given the previous local observations:

$\hat{x}_i(t) = E\,(x(t) \mid y_i(s),\ t_o \le s \le t).$  (2.52)

The local error covariance matrix P_i(t) associated with the
optimal estimator is given by the matrix Riccati differential equation [6]:

$\dot{P}_i(t) = A P_i(t) + P_i(t) A' + B Q B' - P_i(t)\, C_i' R_i^{-1} C_i\, P_i(t), \quad P_i(t_o) \text{ given}.$  (2.53)

In this thesis, we deal with the limiting case $t_0 \to -\infty$; that is, we
assume that the process has been under observation for a long time. In
this case, due to our assumptions of the stabilizability of the pair (A, B)
and the detectability of the pair $(A, C_i)$, the solution of the Riccati
equation (2.53) approaches a steady-state constant value $\bar{P}_i$, which is the
unique nonnegative-definite symmetric solution of the algebraic matrix
Riccati equation:

    $0 = A \bar{P}_i + \bar{P}_i A' + BQB' - \bar{P}_i C_i' R_i^{-1} C_i \bar{P}_i.$    (2.54)
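As an illustration of the steady-state argument, the following minimal
Python sketch treats the scalar case only. It assumes scalar values a, b,
q, c, r standing in for A, B, Q, $C_i$, $R_i$ (the function names and the
numerical values are hypothetical and are not taken from the thesis). It
computes the nonnegative root of the scalar algebraic Riccati equation in
closed form and checks that a simple Euler integration of the scalar
Riccati differential equation settles at the same value.

```python
import math

def steady_state_variance(a, b, q, c, r):
    """Nonnegative root of 0 = 2*a*P + b^2*q - (c^2/r)*P^2 (scalar Riccati equation)."""
    h = c * c / r                      # scalar counterpart of C' R^{-1} C (assumes c != 0)
    return (a + math.sqrt(a * a + h * b * b * q)) / h

def integrate_riccati(p0, a, b, q, c, r, horizon=50.0, dt=1e-3):
    """Forward-Euler integration of dP/dt = 2*a*P + b^2*q - (c^2/r)*P^2."""
    p = p0
    for _ in range(int(horizon / dt)):
        p += dt * (2 * a * p + b * b * q - (c * c / r) * p * p)
    return p

if __name__ == "__main__":
    params = dict(a=0.3, b=1.0, q=1.0, c=1.0, r=0.5)   # illustrative values only
    print(steady_state_variance(**params))
    print(integrate_riccati(5.0, **params))
```

The open-loop dynamics in this example are unstable (a > 0), yet the
variance still converges because the scalar pair (a, c) is detectable.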

The  messages  sent  by  the  nodes  at  discrete  times  are  assumed  to 
contain  all  the  information  about  the  observations  taken  by  the  node  since 
the  last  message  was  sent,  plus  all  the  information  received  from  the 
other  nodes  during  this  interval.  The  content  of  messages  for  transfer 
of  information  with  no  loss  of  optimality  (that  is,  reduced  sufficient 
statistics  which  lead  to  the  same  estimation  performance  as  would  the 
actual  set  of  observations)  is  discussed  in  Chapter  3.  In  short,  in  our 
model  there  is  no  loss  of  information  due  to  the  communications  between 
the nodes; there is only delay of information.

2.5 Model for the Communication Delays

Message delays between nodes in a communications network in
general exhibit stochastic behavior. In this thesis, with the
exception of Section 5.2, we will approximate these delays by their
average values, for the sake of mathematical tractability. In particular,
except for Section 5.3, we assume that the delay on link (i,j) is a convex
and monotonically increasing function of the average traffic rate on that
link in messages/unit time, with a finite escape value $C_{ij}$ which will be
defined as the capacity of the link. A possible dependence of $d_{ij}$, the
communications delay, on $f_{ij}$, the average traffic rate, is illustrated in
Fig. 2.1.
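One delay law with the stated properties (convex, monotonically
increasing, with a finite escape at the link capacity) is the familiar
queueing-type form d(f) = 1/(C - f). The sketch below uses that form
purely as an assumed example; the thesis does not prescribe a particular
function, and the name link_delay is hypothetical.

```python
def link_delay(rate, capacity):
    """Assumed average delay on a link: 1/(capacity - rate) for 0 <= rate < capacity.

    The delay grows without bound as the traffic rate approaches the
    capacity (the finite escape value) and is reported as infinite at or
    beyond it.
    """
    if rate < 0:
        raise ValueError("traffic rate must be nonnegative")
    if rate >= capacity:
        return float("inf")
    return 1.0 / (capacity - rate)

if __name__ == "__main__":
    for f in (0.2, 0.5, 0.8, 0.99):
        print(f, link_delay(f, capacity=1.0))
```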
The Objective Functions


The  goal  is  to  have  the  best  mean-square  estimate  of  the  state  of 
the  system  at  the  destination  node.  But  since  statistical  information 
carried  by  messages  from  sensor  nodes  arrives  at  the  destination  node 
at discrete times, the mean-square estimation error is a time-varying
quantity. For the purposes of optimization, it is desirable to try to
express the performance in terms of a scalar. For the problem under
consideration, the time-average and the maximum value of the
mean-square estimation error at the destination are meaningful possibilities.
Again for the sake of mathematical tractability, we will prefer the latter
criterion in Chapter 4, where the main optimization results are presented.

CHAPTER 3

DECENTRALIZED  ESTIMATION  ISSUES 


3.1 Introduction


Consider the system configuration shown in Fig. 3.1.


Fig. 3.1 System configuration for Chapter 3.


There are two sensor nodes, #1 and #2, and a destination node,
#3. The sensor nodes observe independently a system driven by w, a
Brownian process of intensity Q:

    $dx(t) = A x(t)\,dt + B\,dw(t).$    (3.1)

The measurements obtained by the nodes #1 and #2 are denoted as:

    $dy_i(t) = C_i x(t)\,dt + dv_i(t), \quad t \ge 0, \quad i = 1, 2,$    (3.2)


where $v_1$ and $v_2$ are independent Brownian motion processes such that:

    $E\left\{ \begin{bmatrix} dv_1(t) \\ dv_2(t) \end{bmatrix} \begin{bmatrix} dv_1'(t) & dv_2'(t) \end{bmatrix} \right\} = \begin{bmatrix} R_1 & 0 \\ 0 & R_2 \end{bmatrix} dt.$    (3.3)

The filtered minimum mean-square estimate of the state x(t) is
desired at the destination node. We will denote it as:

    $\hat{x}(t) = E\{ x(t) \mid y_1(s),\ 0 \le s \le t;\ y_2(s),\ 0 \le s \le t \}.$    (3.4)


We assume that $\hat{x}(0)$, and the corresponding error covariance
matrix, P(0), are available.

There are two strategies to obtain $\hat{x}(t)$ at node #3 [2]:

(i) Centralized Approach: Nodes #1 and #2 send their measurements
to node #3 and node #3 runs a continuous-time Kalman filter to
obtain $\hat{x}(t)$.

(ii) Decentralized Approach: Nodes #1 and #2 perform local
information processing and send sufficient statistics to node #3, which
utilizes these to obtain $\hat{x}(t)$.

The clue to finding the correct approach lies in the amount of
information that has to be transmitted between the nodes. For a realistic
modelling of the communication links with finite channel capacity, it is not
possible to transmit information continuously with finite delay. Rather,
messages must be coded and transmitted starting at discrete points in
time. For the estimation problem considered here, let $t_i^j$ be the time
the j-th message is sent from node #i to the destination. For optimum
state estimation at the destination node with discrete-time communications,
node #i must send at time $t_i^j$ some information which will lead to the same
estimation performance (in terms of the mean-square error at all points
in time) as if it had sent all its observations since the last message:
$\{ y_i(t),\ t_i^{j-1} \le t \le t_i^j \}$. (This assumes no messages are lost during
transmission. If messages may be lost, then information equivalent to
$\{ y_i(t),\ 0 \le t \le t_i^j \}$ must be sent.) Since sending
$\{ y_i(t),\ t_i^{j-1} \le t \le t_i^j \}$ is not possible due to capacity limitations
(or, assuming the observations are also made at discrete times much more
frequently than messages are sent, sending all the observations is
impractical in view of the large volume of data), doing more local
information processing and sending shorter messages which give the same
estimation performance is desired.

3.2 Local Information Processing for Decentralized
Estimation with Asynchronous Reporting Times

An algorithm for optimal decentralized estimation with decentralized
processing and finite length messages has been developed [2], [11]
for synchronous reporting times, that is, for $t_1^j = t_2^j \triangleq t^j$. According to
this algorithm, $\hat{x}(t^j)$ is obtained as follows.

At time $t^j$, the sensor nodes send $\hat{x}_i(t^j)$, i = 1, 2, the local m.s.e.
estimates, and $r_i(t^j)$, i = 1, 2, a term needed to correct for the correlation
of $\hat{x}_1$ and $\hat{x}_2$ due to $\hat{x}(0)$ and w(t), whose statistical specifications are
common knowledge to both sensor nodes. $\hat{x}_i(t^j)$ and $r_i(t^j)$ are obtained by


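    $\dot{\hat{x}}_i(t) = [A - P_i(t) C_i' R_i^{-1} C_i]\,\hat{x}_i(t) + P_i(t) C_i' R_i^{-1} y_i(t), \quad i = 1, 2$    (3.5)

    $\hat{x}_i(0) = \hat{x}(0), \quad i = 1, 2$    (3.6)

    $\dot{r}_i(t) = [A - P(t)(C_1' R_1^{-1} C_1 + C_2' R_2^{-1} C_2)]\,r_i(t) + [P(t) P_i^{-1}(t) - I]\,BQB'\,P_i^{-1}(t)\,\hat{x}_i(t), \quad i = 1, 2$    (3.7)

    $r_1(0) + r_2(0) = -\hat{x}(0)$    (3.8)

    $\dot{P}(t) = A P(t) + P(t) A' + BQB' - P(t)(C_1' R_1^{-1} C_1 + C_2' R_2^{-1} C_2) P(t), \quad P(0)\ \text{given}$    (3.9)

    $\dot{P}_i(t) = A P_i(t) + P_i(t) A' + BQB' - P_i(t) C_i' R_i^{-1} C_i P_i(t), \quad i = 1, 2$    (3.10)

    $P_i(0) = P(0), \quad i = 1, 2$    (3.11)

Then:

    $\hat{x}(t^j) = r_1(t^j) + r_2(t^j) + P(t^j)\,[P_1^{-1}(t^j)\,\hat{x}_1(t^j) + P_2^{-1}(t^j)\,\hat{x}_2(t^j)]$    (3.12)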
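The synchronous fusion rule can be checked numerically in the scalar case.
The sketch below is an illustration under stated assumptions, not the
implementation of [2] or [11]: it takes the fusion rule to be
$\hat{x} = r_1 + r_2 + P(P_1^{-1}\hat{x}_1 + P_2^{-1}\hat{x}_2)$, integrates the
local filters, the correction terms and the centralized filter with a
forward-Euler step, and compares the fused value with the centralized
estimate. Because the identity holds path by path, smooth synthetic
observation signals are used in place of simulated noise. All numerical
values and names are hypothetical.

```python
import math

def synchronous_fusion_demo(horizon=2.0, dt=1e-4):
    """Euler-integrate the scalar local/global filters and apply the fusion rule."""
    a, bqb = -0.5, 1.0                      # scalar A and BQB' (assumed values)
    c = (1.0, 0.5)                          # C_1, C_2
    rn = (1.0, 2.0)                         # noise intensities R_1, R_2
    h = [c[i] ** 2 / rn[i] for i in range(2)]   # C_i' R_i^{-1} C_i
    x0, p0 = 0.3, 1.0                       # initial estimate and covariance

    p, xg = p0, x0                          # centralized covariance and estimate
    pi = [p0, p0]                           # local covariances start at P(0)
    xi = [x0, x0]                           # local estimates start at xhat(0)
    r = [-x0, 0.0]                          # correction terms, r1(0) + r2(0) = -xhat(0)

    t = 0.0
    for _ in range(int(round(horizon / dt))):
        y = (math.sin(t), math.cos(2.0 * t))    # synthetic observation signals
        hsum = h[0] + h[1]
        d_xg = (a - p * hsum) * xg + p * sum(c[i] / rn[i] * y[i] for i in range(2))
        d_p = 2 * a * p + bqb - p * p * hsum
        d_xi = [(a - pi[i] * h[i]) * xi[i] + pi[i] * c[i] / rn[i] * y[i]
                for i in range(2)]
        d_r = [(a - p * hsum) * r[i] + (p / pi[i] - 1.0) * bqb * xi[i] / pi[i]
               for i in range(2)]
        d_pi = [2 * a * pi[i] + bqb - pi[i] ** 2 * h[i] for i in range(2)]

        # Update every state from the derivatives computed at the old state.
        xg += dt * d_xg
        p += dt * d_p
        for i in range(2):
            xi[i] += dt * d_xi[i]
            r[i] += dt * d_r[i]
            pi[i] += dt * d_pi[i]
        t += dt

    fused = r[0] + r[1] + p * (xi[0] / pi[0] + xi[1] / pi[1])
    return xg, fused

if __name__ == "__main__":
    print(synchronous_fusion_demo())
```

The two returned numbers should agree up to the discretization error of
the Euler scheme, which shrinks with the step size dt.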


Now we will describe the extension of the above solution to the
case of asynchronous reporting times. Again let $t_1^j$ and $t_2^k$ be the j-th and
the k-th reporting times of sensor nodes #1 and #2, respectively. Let $t_2^{k^*}$
be a particular reporting time of node #2. Other relevant times are
defined below and in the timing diagram, Fig. 3.2.

    $t_1^{j^*} = \min_{t_1^j \ge t_2^{k^*}} t_1^j$    (3.13)

    $t_1^{j^{**}} = \max_{t_1^j \le t_2^{k^*+1}} t_1^j$    (3.14)

We assume that there exists some $t_1^j$ in the interval $[t_2^{k^*}, t_2^{k^*+1}]$.
Otherwise, the solution can be applied by exchanging the roles of the
nodes #1 and #2. We will specify the sufficient statistics that must be
sent by the sensor nodes #1 and #2 and the necessary computations that
must be carried out at the destination node #3 to find the optimal $\hat{x}(t)$ at
node #3 during the interval $[t_1^{j^*}, t_2^{k^*+1}]$. This interval is the smallest
period that captures all the characteristics of the solution.
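The bookkeeping behind these reporting times can be sketched as follows,
assuming that $j^*$ indexes the first report of node #1 at or after
$t_2^{k^*}$ and $j^{**}$ the last report of node #1 at or before
$t_2^{k^*+1}$. The function name and the sample reporting times are
hypothetical, and the indices returned are zero-based positions in the
list of node #1 reporting times.

```python
from bisect import bisect_left, bisect_right

def bracketing_reports(t1, t2, k_star):
    """Return (j_star, j_star_star) for the interval [t2[k_star], t2[k_star + 1]].

    t1 and t2 are sorted lists of reporting times of nodes #1 and #2.
    Returns None when node #1 does not report inside the interval, in which
    case the roles of the two nodes would have to be exchanged.
    """
    start, end = t2[k_star], t2[k_star + 1]
    j_star = bisect_left(t1, start)            # first index with t1[j] >= start
    j_star_star = bisect_right(t1, end) - 1    # last index with t1[j] <= end
    if j_star > j_star_star:
        return None
    return j_star, j_star_star

if __name__ == "__main__":
    node1 = [0.5, 1.2, 1.9, 2.4, 3.1]
    node2 = [0.0, 1.0, 2.5, 4.0]
    print(bracketing_reports(node1, node2, 1))
```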

Fig. 3.2 Departure times of the messages
from the sensor nodes.


Theorem 3.1: Sufficient statistics for asynchronous, decentralized
linear estimation are given as follows:


I. Node #2 at $t_2^{k^*}$ sends:

1. $\hat{x}_2(t_2^{k^*})$ from:

    $\dot{\hat{x}}_2(t) = [A - P_2(t) C_2' R_2^{-1} C_2]\,\hat{x}_2(t) + P_2(t) C_2' R_2^{-1} y_2(t)$    (3.15)

    $\hat{x}_2(0) = \hat{x}(0)$    (3.16)

2. $r_2(t_2^{k^*})$ from:

    $\dot{r}_2(t) = [A - P(t)(C_1' R_1^{-1} C_1 + C_2' R_2^{-1} C_2)]\,r_2(t) + [P(t) P_2^{-1}(t) - I]\,BQB'\,P_2^{-1}(t)\,\hat{x}_2(t)$    (3.17)

    $r_2(0) = 0$    (3.18)

    $\dot{P}(t) = A P(t) + P(t) A' + BQB' - P(t)(C_1' R_1^{-1} C_1 + C_2' R_2^{-1} C_2) P(t), \quad P(0)\ \text{given}$    (3.19)

    $\dot{P}_2(t) = A P_2(t) + P_2(t) A' + BQB' - P_2(t) C_2' R_2^{-1} C_2 P_2(t)$    (3.20)

    $P_2(0) = P(0)$    (3.21)

3. Additional statistic necessary for optimal computation of
$\hat{x}(t)$ at the destination before $t_1^{j^*}$. This statistic would be named
$g_2(t_2^{k^*})$ and is the counterpart of $g_1(t_1^{j^{**}})$ described in the following
entries.

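II. Node #1 at $t_1^{j^*}$ sends:

1. $\hat{x}_1(t_2^{k^*})$ from:

    $\dot{\hat{x}}_1(t) = [A - P_1(t) C_1' R_1^{-1} C_1]\,\hat{x}_1(t) + P_1(t) C_1' R_1^{-1} y_1(t)$    (3.22)

    $\hat{x}_1(0) = \hat{x}(0)$    (3.23)

2. $r_1(t_2^{k^*})$ from:

    $\dot{r}_1(t) = [A - P(t)(C_1' R_1^{-1} C_1 + C_2' R_2^{-1} C_2)]\,r_1(t) + [P(t) P_1^{-1}(t) - I]\,BQB'\,P_1^{-1}(t)\,\hat{x}_1(t)$    (3.24)

    $r_1(0) = -\hat{x}(0)$    (3.25)

with P(t) from (3.19) and:

    $\dot{P}_1(t) = A P_1(t) + P_1(t) A' + BQB' - P_1(t) C_1' R_1^{-1} C_1 P_1(t)$    (3.26)

    $P_1(0) = P(0)$    (3.27)

3. $\hat{x}_1(t_1^{j^*})$ from (3.22), (3.23).

4. $g_1(t_1^{j^*})$ from:

    $\dot{g}_1(t) = [A - S_1(t) C_1' R_1^{-1} C_1]\,g_1(t) + [S_1(t) P_1^{-1}(t) - I]\,BQB'\,P_1^{-1}(t)\,\hat{x}_1(t)$    (3.28)

    $g_1(t_2^{k^*}) = 0$    (3.29)

with $P_1(t)$ from (3.26), (3.27) and

    $\dot{S}_1(t) = A S_1(t) + S_1(t) A' + BQB' - S_1(t) C_1' R_1^{-1} C_1 S_1(t)$    (3.30)

    $S_1(t_2^{k^*}) = P(t_2^{k^*})$    (3.31)

with P(t) from (3.19).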

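III. Upon receiving messages sent at $t_2^{k^*}$ and $t_1^{j^*}$, node #3
computes:

1. $\hat{x}(t_2^{k^*}) \triangleq E[x \mid y_1(s),\ 0 \le s \le t_2^{k^*};\ y_2(s),\ 0 \le s \le t_2^{k^*}]$

    $= r_1(t_2^{k^*}) + r_2(t_2^{k^*}) + P(t_2^{k^*})\,[P_1^{-1}(t_2^{k^*})\,\hat{x}_1(t_2^{k^*}) + P_2^{-1}(t_2^{k^*})\,\hat{x}_2(t_2^{k^*})]$    (3.32)

2. $\tilde{x}_1(t_1^{j^*}) \triangleq E[x \mid y_1(s),\ 0 \le s \le t_1^{j^*};\ y_2(s),\ 0 \le s \le t_2^{k^*}]$

    $= g_1(t_1^{j^*}) + h_2(t_1^{j^*}) + S_1(t_1^{j^*})\,[P_1^{-1}(t_1^{j^*})\,\hat{x}_1(t_1^{j^*}) + Z_2^{-1}(t_1^{j^*})\,\bar{x}_2(t_1^{j^*})]$    (3.33)

with P(t) from (3.19); $P_1(t)$ from (3.26), (3.27); $P_2(t)$
from (3.20), (3.21); $S_1(t)$ from (3.30), (3.31); $g_1(t)$
from (3.28), (3.29) and

    $\dot{h}_2(t) = [A - S_1(t) C_1' R_1^{-1} C_1]\,h_2(t) + [S_1(t) Z_2^{-1}(t) - I]\,BQB'\,Z_2^{-1}(t)\,\bar{x}_2(t)$    (3.34)

    $h_2(t_2^{k^*}) = -\hat{x}(t_2^{k^*})$    (3.35)

    $\dot{\bar{x}}_2(t) = A\,\bar{x}_2(t)$    (3.36)

    $\bar{x}_2(t_2^{k^*}) = \hat{x}_2(t_2^{k^*})$    (3.37)

    $\dot{Z}_2(t) = A Z_2(t) + Z_2(t) A' + BQB'$    (3.38)

    $Z_2(t_2^{k^*}) = P_2(t_2^{k^*})$    (3.39)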


IV. Node #1 at $t_1^j$, $j^* + 1 \le j \le j^{**} - 1$, sends:

1. $\hat{x}_1(t_1^j)$ from (3.22), (3.23).

2. $g_1(t_1^j)$ from (3.28), (3.29).

V. Upon receiving the message sent at $t_1^j$, $j^* + 1 \le j \le j^{**}$,
node #3 computes:

    $\tilde{x}_1(t_1^j) \triangleq E[x \mid y_1(s),\ 0 \le s \le t_1^j;\ y_2(s),\ 0 \le s \le t_2^{k^*}],$    (3.40)

which is given by (3.33) with all terms evaluated at $t = t_1^j$ and all other
terms defined in Part III.


VI. Node #1 at $t_1^{j^{**}}$ sends:

1. $\hat{x}_1(t_1^{j^{**}})$ from (3.22), (3.23).

2. $g_1(t_1^{j^{**}})$ from (3.28), (3.29).

3. $r_1(t_1^{j^{**}})$ from (3.24), (3.25).

VII. Node #2 at $t_2^{k^*+1}$ sends:

1. $\hat{x}_2(t_1^{j^{**}})$ from (3.15), (3.16).

2. $r_2(t_1^{j^{**}})$ from (3.17), (3.18).

3. $\hat{x}_2(t_2^{k^*+1})$ from (3.15), (3.16).

4. $g_2(t_2^{k^*+1})$ from:

    $\dot{g}_2(t) = [A - S_2(t) C_2' R_2^{-1} C_2]\,g_2(t) + [S_2(t) P_2^{-1}(t) - I]\,BQB'\,P_2^{-1}(t)\,\hat{x}_2(t)$    (3.41)

    $g_2(t_1^{j^{**}}) = 0$    (3.42)

with $P_2(t)$ from (3.20), (3.21) and

    $\dot{S}_2(t) = A S_2(t) + S_2(t) A' + BQB' - S_2(t) C_2' R_2^{-1} C_2 S_2(t)$    (3.43)

    $S_2(t_1^{j^{**}}) = P(t_1^{j^{**}})$    (3.44)

with P(t) from (3.19).
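While no new observations from node #2 are available, the destination
only propagates a local estimate and its covariance open loop, that is,
through equations of the form $\dot{x} = Ax$ and $\dot{Z} = AZ + ZA' + BQB'$. In the
scalar case these have a closed-form solution, sketched below. The
function name predict and the numerical values are hypothetical, and the
scalar bqb stands in for BQB'.

```python
import math

def predict(x0, z0, a, bqb, tau):
    """Scalar open-loop prediction over tau seconds.

    Solves x' = a*x and Z' = 2*a*Z + bqb from the initial values (x0, z0).
    """
    g = math.exp(a * tau)
    x = g * x0
    if a == 0.0:
        z = z0 + bqb * tau                       # pure random-walk growth
    else:
        z = g * g * z0 + bqb * (g * g - 1.0) / (2.0 * a)
    return x, z

if __name__ == "__main__":
    # Predicting in two stages matches predicting once over the whole gap.
    x1, z1 = predict(1.0, 0.2, -0.4, 1.0, 0.5)
    print(predict(x1, z1, -0.4, 1.0, 1.0))
    print(predict(1.0, 0.2, -0.4, 1.0, 1.5))
```

The two-stage property shown in the usage lines is what allows the
destination to restart the prediction from any received initial condition.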


Proof: To show that the mean-square error at node #3 during the
interval $[t_1^{j^*}, t_2^{k^*+1}]$ by this method is the same as in the case with full
observation transmission, it suffices to show that they are equal at the
departure times, since in between these times the same optimal
prediction will be performed at the destination node.

In [2] and [11] it was shown that computing $\hat{x}(t)$ from $\hat{x}_i(t)$ and $r_i(t)$,
as given in (3.5) through (3.12), is equivalent to computing it from $y_i(t)$.
These equations are the basis for the solution presented here also.

$\tilde{x}_1(t_1^{j^*})$, which is defined in (3.33), is computed as follows: $r_i(t)$, the
correction terms, are defined by the differential equations (3.7), which
are driven by $y_i(t)$, the observations. Thus they have to be calculated at
the sensor nodes, except that they can also be calculated at the destination
node when there are no observations available, provided the initial
conditions can be obtained. The interval $[t_2^{k^*}, t_1^{j^*}]$ is such an interval,
when the observations $y_2(t)$ are not available at the destination node by
any means. Consider the initial condition (3.8). It states that the sum
of the initial values of the correction terms must be equal to the
negative of the global state estimate. Node #2 at time $t_2^{k^*}$ sends $\hat{x}_2(t_2^{k^*})$
and $r_2(t_2^{k^*})$, and node #1 at time $t_1^{j^*}$ sends $\hat{x}_1(t_2^{k^*})$ and $r_1(t_2^{k^*})$; thus
$\hat{x}(t_2^{k^*})$ is obtained at the destination. For the interval $[t_2^{k^*}, t_1^{j^*}]$, node
#1 runs the differential equation for $g_1(t)$ (which is also a correction
term like $r_1(t)$, but one which takes into consideration that the
observations of node #2 are not available for this interval, hence $R_2 = \infty$ or
$C_2 = 0$) with the initial condition $g_1(t_2^{k^*}) = 0$, since the global
estimate is not available there. The destination node runs the
differential equations of the predicted local state estimate at node #2, $\bar{x}_2(t)$,
using the initial condition $\hat{x}_2(t_2^{k^*})$, and the corresponding correction term
$h_2(t)$, with the initial condition $h_2(t_2^{k^*}) = -\hat{x}(t_2^{k^*})$, making use of the
global estimate already computed. As a result, utilizing the received
values of $\hat{x}_1(t_1^{j^*})$, $g_1(t_1^{j^*})$ and the computed values of $\bar{x}_2(t_1^{j^*})$, $h_2(t_1^{j^*})$, the
destination node #3 calculates $\tilde{x}_1(t_1^{j^*})$ as in (3.33), which is a variation of
(3.12).

For the computation of $\tilde{x}_1(t_1^j)$, $j^* + 1 \le j \le j^{**} - 1$, as defined
in (3.40), node #1 needs only to send $\hat{x}_1(t_1^j)$ and $g_1(t_1^j)$, which is the
correction term which takes into consideration the fact that observations
of node #2 are not available for the period $[t_2^{k^*}, t_1^j]$. On receiving these
statistics, node #3 computes $\tilde{x}_1(t_1^j)$ as described for $\tilde{x}_1(t_1^{j^*})$ above.

At $t = t_1^{j^{**}}$, node #1 sends $\hat{x}_1(t_1^{j^{**}})$ and $g_1(t_1^{j^{**}})$ for the calculation of
$\tilde{x}_1(t_1^{j^{**}})$, and in addition sends $r_1(t_1^{j^{**}})$, which is necessary for the
calculation of the estimate at $t_2^{k^*+1}$. These statistics are counterparts of
the statistics sent by node #2 at $t_2^{k^*}$. At $t = t_2^{k^*+1}$, node #2 sends the
counterparts of the statistics sent by node #1 at $t_1^{j^*}$. By symmetry, the
roles of the two nodes are reversed, but the procedure is the same for the
next interval.

Q.E.D.

As is apparent from an examination of the algorithm, the
common knowledge required at the sensor nodes consists of the
state-space stochastic model of the system, the local observation models
(including noise intensities), and the reporting times for both sensors.
It is possible to extend the above algorithm to the case of a network of
sensors, with the common knowledge including the local observation
models and reporting times of all nodes.

For the sake of simplicity, we have ignored the delays that are
incurred by the messages. This simplification does not sacrifice any
generality: if the messages are marked by their departure times,
their contents can be utilized retroactive to that time at the destination.
The algorithm still leads to the same estimation performance as the one
that would have been achieved had the messages been carrying all the
previous local observations.

With  the  existence  of  at  least  one  set  of  finite  dimensional 
sufficient  statistics  established,  we  now  turn  to  the  primary  topic  of 
concern  -  how  to  regulate  the  flow  of  messages  in  the  network  to  achieve 
optimal  performance. 


CHAPTER  4 


WORST-CASE  MINIMAX  OPTIMIZATION  IN  THE  STEADY  STATE 

4.1 Introduction

In this chapter we will consider the situation where the local
estimation processes at the sensor nodes have reached the steady state
in the sense of Section 2.4. The objective will be to minimize the
maximum value the mean-square error takes over time at the destination
node. In Section 4.2 we will consider a specific system configuration
with two sensor nodes and a destination node in order to illustrate the
basic approach of this chapter. Optimization with the phase difference
between the reporting times of the two sensors as a control variable, as
well as the frequencies of the reports, will be contrasted with optimization
with respect to the frequencies only. In the latter approach the
optimization will be done for the worst-case phase difference that gives rise to
the highest possible mean-square error at the destination node.

In Sections 4.3 and 4.4, we will study the extension of the
worst-case minimax optimization approach to networks with an arbitrary number
of sensor nodes and a destination node. In Section 4.3, we will allow
any given sensor node to send messages to only one other node. The
resulting solution to the optimization problem will be compared in its
simplicity with the solution of Section 4.4, in which the sensor nodes can
send messages on more than a single path. The solution to the problem
will require nonlinear programming.

4.2 Special Case With 2 Sensor Nodes

4.2.1 The Problem Description

Consider the network structure of Fig. 4.1.


Fig. 4.1 System configuration for Section 4.2.

In Sections 4.2 and 4.3 we will consider a scalar dynamical
system with scalar observations:

    $dx(t) = A x(t)\,dt + B\,dw(t)$    (4.1)

    $dy_i(t) = C_i x(t)\,dt + dv_i(t), \quad i = 1, 2.$    (4.2)

Here x(t) is the state and $y_i(t)$ are the observations. w(t) and
$v_i(t)$ are uncorrelated zero-mean Wiener processes with intensities Q and
$R_i$. In addition, the pair {A, B} is stabilizable and the pairs $\{A, C_i\}$ are
detectable.

The links have the delay models described in Chapter 2 with
$d_1 = f_1(T_1)$ and $d_2 = f_2(T_2)$, where $T_1$ is the intermessage period on
link (1,3) and $T_2$ is the period on link (2,3). The nodes make continuous
measurements, run continuous-time Kalman filters, and send sufficient
statistics, which summarize their observations since the last report,
to the destination node every $T_1$ and $T_2$ seconds, respectively.

The objective is to minimize the maximum value over time of
the mean-square error at the destination node. In Section 4.2.2 the
control variables will be $T_1$, $T_2$, and $\Delta$, the phase difference between
the reporting sequences. In Section 4.2.3, the optimization will be done
over $T_1$ and $T_2$ for the worst case $\Delta$.

4.2.2 Optimization With Respect to $T_1$, $T_2$ and $\Delta$

In this section we will consider minimizing the maximum peak
value of $p(t) \triangleq$ mean-square estimation error at the destination node.

We will assume throughout this section that $T_1/T_2 = k_1/k_2$ for some
integers $k_1$ and $k_2$ so that the waveform for p(t) is periodic. Since the
set of rationals is dense in the set of reals, we lose nothing by this
assumption.
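To make the periodicity assumption concrete, the sketch below reduces the
ratio of two rational periods to lowest terms and returns the integers
$k_1$, $k_2$, the common time unit c with $T_1 = k_1 c$ and $T_2 = k_2 c$, and
the resulting common period $k_1 k_2 c$ of the two reporting sequences. The
function name is hypothetical, and exact rational arithmetic is assumed
for the inputs (binary floating-point periods would not reduce cleanly).

```python
from fractions import Fraction

def common_period(period_1, period_2):
    """Return (k1, k2, c, period) for rational intermessage periods.

    k1/k2 is period_1/period_2 in lowest terms, c = period_1/k1 = period_2/k2,
    and period = k1*k2*c is the time after which both sequences repeat.
    """
    t1, t2 = Fraction(period_1), Fraction(period_2)
    ratio = t1 / t2                      # Fraction keeps this in lowest terms
    k1, k2 = ratio.numerator, ratio.denominator
    c = t1 / k1
    return k1, k2, c, k1 * k2 * c

if __name__ == "__main__":
    print(common_period(Fraction(3, 2), Fraction(1)))
```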

The points in time at which the messages are sent from the two
sensor nodes and received at the destination node are labeled as shown in
Fig. 4.2.

For example, the j-th message from Node #2 leaves Node #2 at
$t_{d2}^j$ and arrives at the destination at $t_{a2}^j$. According to this notation, the
phase difference $\Delta$ is defined as:

    $\Delta = \min\,[\,t_{d2}^j - t_{d1}^i\,]$ for all i, j with $t_{d2}^j \ge t_{d1}^i$    (4.3)

As an aid for visualization, Fig. 4.3 is an example which shows
the waveform of p(t) for the system model:

    $dx(t) = dw(t),$
    $dy_1(t) = x(t)\,dt + dv_1(t),$
    $dy_2(t) = x(t)\,dt + dv_2(t).$    (4.4)


Fig. 4.2 Departure and arrival times of the
messages from the sensor nodes.

(4.10)


Fig. 4.3 Mean-square estimation error vs time for the system of Eq. (4.4)


    $\eta_m(a', b') = \mathrm{trace}\ S_m(a', b')$    (4.11)

    $\dot{S}_m(a', t) = A S_m(a', t) + S_m(a', t) A' + BQB', \quad S_m(a', 0) = P_m(a')$    (4.12)


Therefore, P is the steady-state error covariance matrix that
results from observations by both sensor nodes. It satisfies the matrix
algebraic Riccati equation (4.10). $P_m(a')$ is the error covariance matrix
that results when only the observations from sensor node #m (m = 1, 2) are
available for the last a' seconds. It satisfies the Riccati differential
matrix equation (4.9). $S_m(a', b')$ is the error covariance matrix
resulting from prediction during the last b' seconds after getting observations
from node #m for a' seconds. $S_m(a', t)$ satisfies the linear matrix
equation (4.12).


T1 
:  f  A 


In  each  period  of  p(t)f  there  are  k^  +  kg  peaks,  if  A  c  , 


where $k_1$ and $k_2$ are relatively prime integers. From (4.4) and (4.5)
we can deduce that the peak values in the period, denoted as
$p^1, p^2, \ldots, p^{k_1 + k_2}$, can be expressed in the form:

    $p^i = \phi_m(a^i) + \eta_m(a^i, b^i), \quad m = 1, 2, \quad i = 1, 2, \ldots, k_1 + k_2.$    (4.13)


Here m, $a^i$ and $b^i$ satisfy one of the following 3 conditions:

Type 1:

    $m = 1$ if $T_1 + d_1 < T_2 + d_2$, $\quad m = 2$ if $T_1 + d_1 > T_2 + d_2$

    $a^i \le |(T_1 + d_1) - (T_2 + d_2)|$

    $b^i = \min(T_1 + d_1,\ T_2 + d_2)$    (4.14)


    $a^i + b^i = T_1 + d_1$

    $b^i \le T_2 + d_2$    (4.16)


Using the relationships (4.14) through (4.16) and the fact that $\phi_m$
and $\eta_m$ are monotonically increasing functions of t, with $\eta_m$ increasing
faster than $\phi_m$ with respect to its second argument, we can see that
reducing $T_1 + d_1$ and $T_2 + d_2$ results in a reduction of the values of the
peaks in general. In particular, it can be shown that if $p_{opt}$ is the value
of the highest peak obtained by finding the optimum $\Delta$ for a given
$(T_1 + d_1,\ T_2 + d_2)$ pair, then $p'_{opt} \le p_{opt}$ for a $(T_1' + d_1',\ T_2' + d_2')$ pair
if $T_1' + d_1' \le T_1 + d_1$ and $T_2' + d_2' \le T_2 + d_2$. Hence the first step in the
optimization procedure is to minimize $T_1 + d_1(T_1)$ and $T_2 + d_2(T_2)$
individually, again remembering that the delay on a link is only dependent
on the traffic on that link. Therefore:


    $T_1^* \triangleq \arg\min_{T_1 > 0}\ T_1 + d_1(T_1)$    (4.17)

    $T_2^* \triangleq \arg\min_{T_2 > 0}\ T_2 + d_2(T_2)$    (4.18)

As mentioned above, $T_1^*$ and $T_2^*$ are assumed to satisfy:

    $T_1^* / k_1 = T_2^* / k_2 = c$    (4.19)

for some relatively prime integers $k_1$ and $k_2$.
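The per-link minimization of T + d(T) can be sketched numerically. The
example below assumes, purely for illustration, the delay law
d(T) = 1/(C - 1/T), that is, a queueing-type delay evaluated at the
traffic rate 1/T on a link of capacity C; neither this law nor the
function names come from the thesis. Since T + d(T) is convex for
T > 1/C under this assumption, a ternary search suffices.

```python
def total_lag(period, capacity):
    """period + d(period) with the assumed delay d = 1/(capacity - 1/period)."""
    return period + 1.0 / (capacity - 1.0 / period)

def argmin_convex(fn, lo, hi, iters=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if fn(m1) < fn(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    capacity = 2.0
    # Search just above the stability limit period = 1/capacity.
    best = argmin_convex(lambda t: total_lag(t, capacity), 1.0 / capacity + 1e-9, 50.0)
    print(best, total_lag(best, capacity))
```

For this particular delay law the minimizer can also be found by hand:
setting the derivative to zero gives T = 2/C, with minimum value 4/C,
which is a convenient check on the search.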

Now, using the values $T_1^*$, $d_1(T_1^*)$, $T_2^*$ and $d_2(T_2^*)$, we have to find
$\Delta^*$, which minimizes the value of the highest peak. To this end, we will
give the following facts.

Lemma 4.1: $0 \le \Delta < c$. That is, we only have to carry out the
search for $\Delta_{opt}$ in the interval [0, c).

Proof: Define $\delta(j)$ as the time difference between the departure
of message #j from node #2 and the departure of the most recent
message from node #1:

    $\delta(j) \triangleq t_{d2}^j - \max_{i:\ t_{d1}^i \le t_{d2}^j} t_{d1}^i$    (4.20)

So, according to this terminology,

    $\Delta \triangleq \min_j \delta(j)$    (4.21)

Let us keep the periodic departure time sequence for node #1
fixed. As we shift the periodic departure time sequence for node #2 with
respect to the first one, the value of $\Delta$ will increase until $t_{d1}^i = t_{d2}^j$ for
some i, j, when $\Delta$ becomes zero. Therefore the configuration resulting
at the upper limit for $\Delta$ is the same as that of $\Delta = 0$, and the upper limit
for $\Delta$ will be:

    $\Delta_{sup} = \min_{\delta(j) > 0} \delta(j)$    (4.22)

for this configuration. Let us mark the time at which two departures
coincide from the two nodes as t = 0. Due to the assumption that $k_1$ and
$k_2$ are relatively prime integers, during the next period of p(t), there
will be $k_2$ and $k_1$ departures from nodes #1 and #2, respectively. The
times for these departures will be $0, T_1, 2T_1, \ldots, k_2 T_1$ and
$0, T_2, 2T_2, \ldots, k_1 T_2$, or $0, k_1 c, 2k_1 c, \ldots, k_1 k_2 c$ and
$0, k_2 c, 2k_2 c, \ldots, k_1 k_2 c$ for nodes #1 and #2. Let us label the
departures from node #2 in this period with superscripts $j = 1, 2, \ldots, k_1$ so
that

    $t_{d2}^j = j k_2 c, \quad j = 1, 2, \ldots, k_1.$    (4.23)

Then:

    $\delta(j) = (m_j k_2 - n_j k_1)\,c = r_j c$    (4.24)

where $m_j$, $n_j$ and $r_j$ are positive integers for $j = 1, 2, \ldots, k_1$. Now
$r_i \ne r_j$ for $i \ne j$, $i, j \in \{1, 2, \ldots, k_1\}$, because otherwise it would mean
that the time difference between the departure of a message from node
#2 and the departure of the most recent message from node #1 is the
same for two different messages from node #2 during the p(t) period
$(0, k_1 k_2 c]$. This in turn would imply that there is a smaller period of
p(t) within the period $(0, k_1 k_2 c]$, which contradicts the fact that $k_1$ and $k_2$
are mutually prime integers. By the definition of $\delta(j)$, $\delta(j) = r_j c < T_1 =
k_1 c$; therefore $r_j \in \{1, 2, \ldots, k_1 - 1\}$ for $j = 1, 2, \ldots, k_1$. Together
with $r_i \ne r_j$ for $i \ne j$, which was shown above, this implies that
$\exists\, j \in \{1, 2, \ldots, k_1\}$ such that $r_j = 1$, so that $\delta(j) = c$. This implies that
$\Delta_{sup} = c$.
Q.E.D.
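The content of Lemma 4.1 can be checked with exact rational arithmetic.
The sketch below assumes node #1 departs at multiples of $T_1 = k_1 c$ and
node #2 at an arbitrary offset plus multiples of $T_2 = k_2 c$, computes the
time since the most recent node #1 departure for each node #2 departure
in one period, and takes the minimum as the phase difference. The
function name and sample values are hypothetical. For relatively prime
$k_1$ and $k_2$ the result equals the offset reduced modulo c, so it always
lies in [0, c).

```python
from fractions import Fraction

def phase_difference(k1, k2, c, offset):
    """Minimum lag of a node #2 departure behind the latest node #1 departure.

    Node #1 departs at i*k1*c and node #2 at offset + j*k2*c; one period
    contains k1 departures of node #2.
    """
    period_1, period_2 = k1 * c, k2 * c
    lags = [(offset + j * period_2) % period_1 for j in range(k1)]
    return min(lags)

if __name__ == "__main__":
    c = Fraction(1, 2)
    for offset in (Fraction(0), Fraction(1, 5), Fraction(7, 10), Fraction(13, 10)):
        print(offset, phase_difference(3, 2, c, offset))
```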

Lemma 4.2: The "worst case" highest peak occurs when $t_{a1}^i =
t_{a2}^j$ for some i, j, as exemplified in Fig. 4.4. The corresponding phase
difference, denoted by $\Delta_{wc}$, is given by:

    $\Delta_{wc} = c \left\{ \frac{d_1 - d_2}{c} - \left[ \frac{d_1 - d_2}{c} \right] \right\}$    (4.25)

where the $[\cdot]$ function gives the largest integer smaller than or equal to its real
argument.

Proof: Let us say that at a point in time $t_{wc}$ the arrivals of
messages from the two sensor nodes coincide, i.e. $t_{wc} = t_{a1}^i = t_{a2}^j$ for
some i, j. Then the mean-square error at the destination node just
before their arrival, $p(t_{wc}^-)$, is given by:

    $p(t_{wc}^-) = \phi_m(a_{wc}) + \eta_m(a_{wc}, b_{wc})$    (4.26)

where $\phi_m$ and $\eta_m$ are given by (4.8) through (4.12), and:

    $m = 1$ if $T_1 + d_1 < T_2 + d_2$, $\quad m = 2$ if $T_2 + d_2 < T_1 + d_1$    (4.27)

    $a_{wc} = |(T_1 + d_1) - (T_2 + d_2)|$    (4.28)

    $b_{wc} = \min(T_1 + d_1,\ T_2 + d_2),$    (4.29)

so that

    $a_{wc} + b_{wc} = \max(T_1 + d_1,\ T_2 + d_2).$    (4.30)

Relationships (4.14) through (4.16) show the range of arguments
$a^i$ and $b^i$ for the three possible types of peaks of p(t). From these
relationships it can be deduced that, for all peaks $p^i$,

    $a^i + b^i \le \max(T_1 + d_1,\ T_2 + d_2),$    (4.31)

    $b^i \le \min(T_1 + d_1,\ T_2 + d_2).$    (4.32)

Thus the relationships (4.28) through (4.32) imply that $b_{wc}$ is the
maximum value that the second argument of the function $\eta_m$ can take and
that $a_{wc}$ is the maximum value that the argument of the function $\phi_m$ can
take given $b_{wc}$. Since $\phi_m$ and $\eta_m$ are monotonically increasing functions
with $\eta_m$ increasing faster than $\phi_m$ with regard to its second argument,
$p(t_{wc}^-) \triangleq p_{wc}$ is the largest value a peak can take.

The value of $\Delta_{wc}$ in (4.25) can be shown with arguments similar to
those given in the proof of Lemma 4.1.

Q.E.D.
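The worst-case phase can be illustrated with exact rational arithmetic.
The sketch below reads the expression for $\Delta_{wc}$ as the remainder of
$d_1 - d_2$ modulo c, which is an interpretation adopted for this example,
and checks whether an arrival from node #1 coincides with an arrival from
node #2 within one period. The function names and the sample delays are
hypothetical.

```python
from fractions import Fraction

def worst_case_phase(d1, d2, c):
    """Remainder of (d1 - d2) modulo c, i.e. c * ((d1-d2)/c - floor((d1-d2)/c))."""
    return (d1 - d2) % c

def arrivals_coincide(k1, k2, c, d1, d2, phase):
    """True if some node #1 arrival equals some node #2 arrival in one period.

    Node #1 departs at i*k1*c and arrives d1 later; node #2 departs at
    phase + j*k2*c and arrives d2 later. Times are compared modulo the
    common period k1*k2*c.
    """
    period = k1 * k2 * c
    node1 = {(i * k1 * c + d1) % period for i in range(k2)}
    node2 = {(phase + j * k2 * c + d2) % period for j in range(k1)}
    return bool(node1 & node2)

if __name__ == "__main__":
    c, d1, d2 = Fraction(1, 2), Fraction(9, 10), Fraction(1, 5)
    wc = worst_case_phase(d1, d2, c)
    print(wc, arrivals_coincide(3, 2, c, d1, d2, wc))
```

Any other phase in [0, c) separates the arrivals, which is why the
worst case is isolated at a single value of the phase difference.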

Lemma 4.3: Consider the following two intervals for $\Delta$:

    Interval A: $[0, \Delta_{wc})$    (4.33)

    Interval B: $[\Delta_{wc}, c)$    (4.34)

For a particular $\Delta_A \in$ Interval A, define:

    $p^{i_{A1}} = p(t_{a1}^{i_A})$    (4.35)

    $p^{i_{A2}} = p(t_{a2}^{j_A})$    (4.36)

where

    $t_{a1}^{i_A} = \arg\min_{t_{a1}^i:\ t_{a1}^i < t_{a2}^j} (t_{a2}^j - t_{a1}^i)$ for all i, j    (4.37)

    $t_{a2}^{j_A} = \arg\min_{t_{a2}^j:\ t_{a2}^j < t_{a1}^i} (t_{a1}^i - t_{a2}^j)$ for all i, j    (4.38)

and

    $i_{A1},\ i_{A2} \in \{1, 2, \ldots, k_1 + k_2\}.$

Then,

    $\max_t p(t) = p^{i_{A1}}$ or $p^{i_{A2}}$    (4.39)

for all values of $\Delta \in$ Interval A.

Defining $p^{i_{B1}}$ and $p^{i_{B2}}$ analogously for a particular
$\Delta_B \in$ Interval B, we have:

    $\max_t p(t) = p^{i_{B1}}$ or $p^{i_{B2}}$    (4.40)

for all values of $\Delta \in$ Interval B.

Proof:  Assume  Tj  +  d1  >  Tg  +  dg.  For  A  =  A A  in  Interval  A, 


(4.40) 


with 


p"AI  .  *,(aS  +  ,(alAl,  bX. 

£  2 


*A1  *Al  _  _  ,  . 

a  +  b  =  Tj  +  d1# 


(4.41) 


(4.42) 


XA1 

Thus,  p  is  a  Type  3  peak  as  described  by  (4.  16).  From  (4.  5),  (4.  6), 

*A1 

(4.  35)  and  (4.  37),  b  is  the  maximum  value  the  second  argument  of  fy, 

can  take  at  all  Type  3  peaks  for  A  =  A^.  Since  $2  and  7^  are  monotonically 

increasing  functions  with  ^increasing  faster  than  with  respect  to  its 

•  2  6 
A 1 

second  argument,  p  is  the  highest  among  Type  3  peaks  for  A  =  A.. 

^Ai 

This  implies  that  p  is  the  highest  Type  3  peak  for  all  values  of 
A  in  Interval  A.  The  values  of  the  peaks  vary  continuously  with  respect 
to  A  in  Interval  A.  Therefore,  assuming  that  some  peak  p'  is  higher 

n 

than  p  *  for  some  A'  in  Interval  A,  there  must  be  A'  between  A .  and  A' 
*A1  *A1 

where  pT  =  p  .  Then  p'  and  p  belong  to  different  periods  of  p(t). 
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with  p' 


Ai 

=  p  for  all  A  in  Interval  A. 
Now, for $\Delta = \Delta_A$,

$$p^{\ell_{A2}} = \phi_2(a^{\ell_{A2}}) + \eta_2(a^{\ell_{A2}}, T_2 + d_2). \qquad (4.43)$$

Thus, $p^{\ell_{A2}}$ is a Type 1 peak as described by (4.14). From (4.5),
(4.6), (4.36) and (4.38), $a^{\ell_{A2}}$ is the maximum value the argument of
$\phi_2$ can take at all Type 3 peaks for $\Delta = \Delta_A$. Since $\phi_2$ and
$\eta_2$ are monotonically increasing functions, with $\eta_2$ increasing with
respect to its first argument, $p^{\ell_{A2}}$ is the highest among Type 1 peaks
for $\Delta = \Delta_A$. Furthermore, it is clear from (4.15) that $p^{\ell_{A2}}$
is higher than all Type 2 peaks, which have $a^{\ell} + b^{\ell} = T_2 + d_2$;
this is equal to the second argument of $\eta_2$ at $p^{\ell_{A2}}$. Thus
$p^{\ell_{A2}}$ is the highest among Type 1 and Type 2 peaks for
$\Delta = \Delta_A$.

Similarly as above, it can be shown that $p^{\ell_{A2}}$ is the highest
among Type 1 and Type 2 peaks for all values of $\Delta$ in Interval A. Hence
(4.39) has been established.

For a particular $\Delta = \Delta_B$ in Interval B, it can be shown that no
Type 1 peak exists, and that $p^{\ell_{B1}}$ and $p^{\ell_{B2}}$ are the highest
among Type 3 and Type 2 peaks, respectively, and consequently that this is so
for all values of $\Delta$ in Interval B, thus establishing (4.40).

For $T_1 + d_1 \le T_2 + d_2$, the proof is similar.

Q.E.D.


The physical interpretation of this result is as follows. For
Interval A, $p^{\ell_{A1}}$ and $p^{\ell_{A2}}$ are the two "candidates" for
being the highest peak; so are $p^{\ell_{B1}}$ and $p^{\ell_{B2}}$ for
Interval B. The stated result says that once we find the candidate peaks for a
particular value of $\Delta$ in the interval, they stay as candidate peaks for
all values of $\Delta$ inside that interval. Hence we only have to consider the
values of these 2 peaks rather than $(k_1 + k_2)$ peaks.


Lemma 4.4: Either the optimal phase difference $\Delta^*$ is zero, or at
$\Delta^*$, $p^{\ell_{A1}} = p^{\ell_{A2}}$ if $\Delta^* \in$ Interval A and
$p^{\ell_{B1}} = p^{\ell_{B2}}$ if $\Delta^* \in$ Interval B.

Proof: From (4.5), (4.6), (4.35), (4.36), (4.41) and (4.42) it can be shown
that $p^{\ell_{A1}}$ decreases monotonically as a function of $\Delta$ in
Interval A and similarly that $p^{\ell_{A2}}$ increases monotonically, with
$p^{\ell_{A2}} = p_{wc} > p^{\ell_{A1}}$ at $\Delta_{wc}$. Hence either
$p^{\ell_{A2}} > p^{\ell_{A1}}$ for all $\Delta \in$ Interval A, or
$p^{\ell_{A2}} = p^{\ell_{A1}}$ at a point in Interval A. Similar facts are
true for Interval B. Hence the claim of the lemma follows from (4.39) and
(4.40).

Q.E.D.


Now we can summarize the optimization procedure.

Proposition 4.1: Minimization of the maximum peak value of $p(t)$ with
respect to $T_1$, $T_2$ and $\Delta$ can be achieved by the following
algorithm.

Step 1: Find $T_1$ and $T_2$ by (4.17) and (4.18).

Step 2: Find Intervals A and B by (4.19), (4.25), (4.32) and (4.34).

Step 3: Using $\Delta_A = \Delta_{wc}/2$ and $\Delta_B = (\Delta_{wc} + c)/2$,
find the candidate peaks with indices $\ell_{A1}$, $\ell_{A2}$, $\ell_{B1}$ and
$\ell_{B2}$.

Step 4: In Interval A, find the phase difference $\Delta_A^*$ for which
$p^{\ell_{A1}} = p^{\ell_{A2}} \triangleq p_A^*$. If
$p^{\ell_{A2}} > p^{\ell_{A1}}$ everywhere in Interval A, then set
$\Delta_A^* = 0$ and $p_A^* = p^{\ell_{A2}}$ for $\Delta = 0$.
Carry out the same procedure for Interval B also, and find $\Delta_B^*$
and $p_B^*$.

Step 5: Set $\Delta^* = \Delta_A^*$ if $p_A^* \le p_B^*$, and
$\Delta^* = \Delta_B^*$ otherwise.
4.2.3 Worst-Case $\Delta$ Optimization with Respect to $T_1$ and $T_2$

Adopting a minimax approach and optimizing with respect to $T_1$ and $T_2$
for the phase difference $\Delta$ which gives rise to the highest possible
peak simplifies the optimization procedure significantly.

The situation when the worst-case peak occurs was depicted in Fig. 4.4 and
the corresponding $\Delta_{wc}$ was given by (4.25). The value of the
worst-case peak can be written as:

$$p_{wc} = \phi_m(a) + \eta_m(a, b) \qquad (4.45)$$

where:

$$m = \begin{cases} 1 & \text{if } T_1 + d_1 < T_2 + d_2 \\ 2 & \text{if } T_1 + d_1 > T_2 + d_2 \end{cases}$$

$$a = |(T_1 + d_1) - (T_2 + d_2)|$$

$$b = \min[T_1 + d_1,\ T_2 + d_2].$$


Now, $\eta_m$ increases faster than $\phi_m$ with respect to its second
argument, and $a + b = \max[T_1 + d_1, T_2 + d_2]$ = fixed. Hence minimizing
$p_{wc}$ involves first minimizing $b$, then $a$. Thus
$\min p_{wc} \Rightarrow \min b = \min[T_1 + d_1, T_2 + d_2]$ first. Then
$\min a = |(T_1 + d_1) - (T_2 + d_2)|$; hence
$\min \max[T_1 + d_1, T_2 + d_2]$. Therefore we have to minimize
$T_1 + d_1$ and $T_2 + d_2$ individually. Again we have assumed the
independence of $d_1$ and $d_2$.

This result is summarized in the following.

Proposition 4.2: Minimization of the highest peak of $p(t)$ for the
worst-case phase difference $\Delta$, i.e. minimization of
$\max_{\Delta, t} p(t)$ with respect to $T_1$ and $T_2$, is equivalent to the
minimization of $T_1 + d_1(T_1)$ with respect to $T_1$ and of
$T_2 + d_2(T_2)$ with respect to $T_2$.
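
As an illustration of Proposition 4.2, the following minimal sketch minimizes a single term $T + d(T)$ numerically by golden-section search. The delay model $d(T) = 1/(C - 1/T)$, the capacity value `C` and the function name are assumptions made only for this example; the delay function actually used in this work is the one defined in Chapter 2.

```python
import math

def minimize_period_plus_delay(delay, lo, hi, tol=1e-9):
    """Golden-section search for the T in [lo, hi] that minimizes T + delay(T).

    Assumes T + delay(T) is unimodal on [lo, hi].
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    cost = lambda T: T + delay(T)
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if cost(c) < cost(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    T_star = 0.5 * (a + b)
    return T_star, cost(T_star)

# Hypothetical delay model (an assumption, not taken from the text):
# d(T) = 1 / (C - 1/T), defined for message periods T > 1/C.
C = 2.0
delay = lambda T: 1.0 / (C - 1.0 / T)
T_star, value = minimize_period_plus_delay(delay, 1.0 / C + 1e-6, 50.0)
```

For this assumed delay model the minimum can also be found by hand: with $f = 1/T$ the cost is $1/f + 1/(C - f)$, which is minimized at $f = C/2$, so the search should settle near $T = 2/C$ with cost $4/C$.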
4.3 A Restricted Routing Solution to the Worst-Case Minimax
Optimization Problem in a Sensor Network

4.3.1 The Problem Description

In this section we are going to consider a network with N sensor
nodes and a destination node. Any given node #i has a set of neighbors
$\Omega(i)$ for which links $(i, j)$, $j \in \Omega(i)$, exist. Delay on link
$(i, j)$ is assumed to be a function of the traffic rate on that link only:
$d_{ij} = f_{ij}(T_{ij})$, where $d_{ij}$ is the delay per message and
$T_{ij}$ is the intermessage period on link $(i, j)$.

We  will  consider  a  restricted  class  of  routing  schemes  in  this 
section.  Specifically,  any  given  node  will  be  allowed  to  send  messages 
to  only  one  of  its  neighbors,  for  all  time;  and  these  messages  will  be 
sent  periodically.  The  general  case  is  considered  in  Section  4.  4. 

There is assumed to be one destination node, and the objective is
to minimize the maximum value over time, or the highest peak, of the
mean-square state estimation error at the destination node. We will
optimize for the worst-case timing relationship that can possibly exist
between the reporting times of the nodes, when one set of messages for
all the nodes arrives at the destination node simultaneously with the
greatest total delay. Therefore, here the worst possible mean-square
error with given $T_{ij}$, hence $d_{ij}$, values is implied. This worst
case, by extension of the result of Case (2) of Section 4.2.2, will occur
when the messages arrive at node #j almost at the same time, but only
slightly later than the departure time of a message from node #j, for all
nodes j. In this case the messages which have just arrived have to wait at
node #j for the interreporting period of this node.

For all schemes of routing, it is assumed that optimal data
fusion, described in Chapter 3, is employed at the intermediate nodes.


4.3.2 The Algorithm

Proposition 4.3: The algorithm for minimizing $p_{wc} = \max p(t)$
at the destination node for the worst-case timing relationship between
the reporting times of the nodes is as follows.

1. Minimize $T_{ij} + d_{ij}(T_{ij})$ for all links $(i, j)$. Let
$T_{ij}^* \triangleq \arg\min\, [T_{ij} + d_{ij}(T_{ij})]$.

2. Find the shortest path spanning tree $\mathcal{T}$, using
$T_{ij}^* + d_{ij}(T_{ij}^*)$ as the "length" of link $(i, j)$.

3. Let node #i send its messages on link $(i, j)$, $(i, j) \in \mathcal{T}$,
with a period of $T_{ij}^*$. Do this for all nodes of the network.
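
Steps 2 and 3 above can be sketched as follows, assuming Step 1 has already produced the optimal length $T_{ij}^* + d_{ij}(T_{ij}^*)$ of every link. The sketch runs Dijkstra's algorithm outward from the destination over reversed links; the function name, the dictionary layout and the small example network are illustrative assumptions, not part of the original algorithm statement.

```python
import heapq

def restricted_routing(link_length, destination):
    """Shortest-path tree toward `destination`.

    link_length maps a directed link (i, j) to its length T*_ij + d_ij(T*_ij).
    Returns (next_hop, D): next_hop[i] is the neighbor that node i sends to,
    and D[i] is the total length of node i's path to the destination.
    """
    # Reverse adjacency, so the search can expand from the destination.
    incoming = {}
    for (i, j), length in link_length.items():
        incoming.setdefault(j, []).append((i, length))

    D = {destination: 0.0}
    next_hop = {}
    heap = [(0.0, destination)]
    settled = set()
    while heap:
        dist, j = heapq.heappop(heap)
        if j in settled:
            continue
        settled.add(j)
        for i, length in incoming.get(j, []):
            candidate = dist + length
            if candidate < D.get(i, float("inf")):
                D[i] = candidate
                next_hop[i] = j
                heapq.heappush(heap, (candidate, i))
    return next_hop, D

# Hypothetical 3-sensor network; node 0 is the destination.
lengths = {(1, 0): 3.0, (2, 0): 7.0, (2, 1): 2.0, (3, 2): 1.0, (3, 1): 4.0}
next_hop, D = restricted_routing(lengths, destination=0)
```

In this example node 2 relays through node 1 (length 2 + 3 = 5) rather than using its direct link of length 7, and node 3 relays through node 2.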


Proof: The worst-case highest peak will occur at time
$t' = t^{i}_{a1} = t^{j}_{a2} = \cdots = t^{z}_{aN}$, for some $i, j, \ldots, z$,
according to the timing notation of Section 4.2. For any algorithm which
conforms to the restrictions we have posed, let node #i send its messages
periodically with period $T_{ij}$ on link $(i, j)$, and let this message be
routed over intermediate nodes #i, #j, #k, ..., #y, #z. Define:

$$D_i \triangleq (T_{ij} + d_{ij}) + (T_{jk} + d_{jk}) + \cdots + (T_{yz} + d_{yz}). \qquad (4.46)$$

Rename the nodes so that $D_i \ge D_{i+1}$ for $1 \le i < N$. Then the value
of the worst-case highest peak, $p_{wc}$, can be written as:

$$p_{wc} = \phi_N(D_{N-1} - D_N) + \lambda(D_N) \qquad (4.47)$$

$$\phi_m(t) = \operatorname{trace} P_m(t), \quad m = 2, \ldots, N \qquad (4.48)$$

$$\dot{P}_m(t) = A P_m(t) + P_m(t) A' + B Q B' - \sum_{k=m}^{N} P_m(t)\, C_k' R_k^{-1} C_k\, P_m(t) \qquad (4.49)$$

$$P_m(0) = \begin{cases} P_{m-1}(D_{m-2} - D_{m-1}), & m = 3, \ldots, N \\ \bar{P}, & m = 2 \end{cases} \qquad (4.50)$$

$$0 = A \bar{P} + \bar{P} A' + B Q B' - \sum_{k=1}^{N} \bar{P}\, C_k' R_k^{-1} C_k\, \bar{P} \qquad (4.51)$$

$$\lambda(D_N) = \operatorname{trace} S(D_N) \qquad (4.52)$$

$$\dot{S}(t) = A S(t) + S(t) A' + B Q B' \qquad (4.53)$$

$$S(0) = P_N(D_{N-1} - D_N) \qquad (4.54)$$

Now, $\lambda(t) > \phi_m(t)$ with $\lambda(0) = \phi_m(0)$ for all $t$,
$m = 1, \ldots, N$, and $\phi_k(t) > \phi_l(t)$ with
$\phi_k(0) = \phi_l(0)$ for all $t$, $k > l$. Since the $D_m$'s can be
minimized independently (due to independence of delay from traffic on
different links), minimizing $p_{wc}$ requires minimizing first $D_N$, then
$D_{N-1} - D_N$, hence $D_{N-1}$; then $D_{N-2} - D_{N-1}$, hence
$D_{N-2}$, and so on. Therefore, all $D_m$'s have to be minimized
independently, which can be achieved by the algorithm as stated above.

Q.E.D.
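
To make the staged structure of (4.47) through (4.54) concrete, here is a minimal scalar sketch. It assumes a scalar system ($A = a$, $B = 1$, $Q = q$) in which sensor $k$ contributes the information rate $g_k = c_k^2 / r_k$, and it integrates the scalar Riccati and prediction equations with a fixed-step Runge-Kutta scheme. The function names, the step count and the numerical values are illustrative assumptions.

```python
import math

def rk4(f, y, duration, steps=2000):
    """Integrate dy/dt = f(y) over `duration` with the classical RK4 scheme."""
    if duration <= 0.0:
        return y
    h = duration / steps
    for _ in range(steps):
        k1 = f(y)
        k2 = f(y + 0.5 * h * k1)
        k3 = f(y + 0.5 * h * k2)
        k4 = f(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y

def worst_case_peak(a, q, gains, D):
    """Scalar analogue of (4.47)-(4.54).

    gains[k] = c_k**2 / r_k for node k+1; D must satisfy D[0] >= D[1] >= ...
    """
    N = len(D)
    g_all = sum(gains)
    # Steady state with all sensors: 0 = 2 a P + q - g P^2, cf. (4.51).
    P = (a + math.sqrt(a * a + q * g_all)) / g_all
    # Stage m = 2..N: only nodes m..N report, for a duration D_{m-1} - D_m.
    for m in range(2, N + 1):
        g_m = sum(gains[m - 1:])
        P = rk4(lambda p: 2.0 * a * p + q - g_m * p * p, P, D[m - 2] - D[m - 1])
    # Last stage: pure prediction over D_N, cf. (4.53) and (4.54).
    return rk4(lambda s: 2.0 * a * s + q, P, D[-1])

# Scalar Wiener process (a = 0) observed by two identical sensors.
equal_delays = worst_case_peak(0.0, 1.0, [1.0, 1.0], [1.0, 1.0])
unequal_delays = worst_case_peak(0.0, 1.0, [1.0, 1.0], [2.0, 1.0])
```

With equal effective delays the Riccati stage has zero duration, so the peak is the steady-state error plus $q D_N$; increasing $D_1$ alone raises the peak, in line with the argument that every $D_m$ should be made as small as possible.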

4.  4  A  General  Routing  Approach  to  the  Worst-Case  Minimax 
Optimization  Problem  in  a  Sensor  Network 

4.  4.  1  Introduction 

In this section, we will lift the restriction that a given node can
send messages to only one other node. The nodes will be allowed to send
messages over several different paths to the destination, utilizing some
channel capacity which may have been left unused by the routing solution
of the previous section. This way a node will be able to send information
more frequently with the same effective delay or, alternatively, the
same amount of information with less delay.

As explained in Section 2.3, in Sections 4.4.2 and 4.4.3 the A
matrix in the dynamical system model is assumed to be multivariable
with the structure indicated in (2.30). This will ensure that the
mean-square error is a monotonically increasing function of time for the
optimization problems considered in these sections.

We have seen in the last section that the worst-case highest peak
of the mean-square error occurs at the time instant when the messages
from all the nodes happen to arrive simultaneously. This is because
just before these messages all arrive, the destination node has not
received any information from any of the nodes for the maximum possible
duration of time; specifically, for $T_i + d_i$ units from node #i, where
$d_i$ is the total delay to the destination. We will call this duration the
"effective delay" from node #i, and denote it by $D_i$.

When we allow the messages to be routed over multiple paths, the
effective delays on different paths will in general be different. If we
label the departure and arrival times of the messages in the order they
depart, as shown in Fig. 4.5 (for k = 3), then the effective delay from
node #i will be:

$$D_i = \max_j \Big( t_a^j - \max_{t_a^k < t_a^j} t_d^k \Big) \quad \text{for all } j, k. \qquad (4.55)$$
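
The definition in (4.55) can be evaluated directly from a list of departure and arrival times. The sketch below is illustrative only: the function name is an assumption, and messages that arrive before anything else has arrived are skipped as a start-up transient, which is a choice made here rather than something stated in the text.

```python
def effective_delay(departures, arrivals):
    """Evaluate (4.55): departures[j] and arrivals[j] are the times of message j."""
    worst = 0.0
    for t_arr in arrivals:
        # Departure times of all messages that arrived strictly earlier.
        earlier = [departures[k] for k, t in enumerate(arrivals) if t < t_arr]
        if not earlier:
            continue  # nothing received yet; skip the start-up transient
        # Age of the freshest information just before this arrival.
        worst = max(worst, t_arr - max(earlier))
    return worst

# One path with period T = 2 and delay d = 1.
single_path = effective_delay([0, 2, 4, 6], [1, 3, 5, 7])

# Two such paths with departures interleaved by half a period.
two_paths = effective_delay([0, 1, 2, 3, 4, 5], [1, 2, 3, 4, 5, 6])
```

The single path gives $T + d = 3$; interleaving a second identical path halves the effective period and gives 2.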


Now, by rearranging the departure times on the paths while
holding the average departure frequencies constant and also adjusting the
relative timings between the paths, we can reduce $D_i$. Obviously, the
minimum will be achieved when all
$(t_a^j - \max_{t_a^k < t_a^j} t_d^k)$ effective delays are equal. An example
of what the mean-square error looks like for the simple case of a single
node with two alternate paths on which the timings have been optimally
adjusted is shown in Fig. 4.6. The observed system is assumed to be
modelled as a scalar Wiener process.

In Section 4.4.2, we will derive the effective delay (D) for the
optimum adjustment of the departure times on the paths from a single
node, given fixed values of $T^j$ and $d^j$ associated with the paths. In
Section 4.4.3, we will consider minimizing D with respect to $T^j$. In
Section 4.4.4, we will address the problem of minimizing $p_{wc}$, the
worst-case maximum peak value of the mean-square error, for the general
network case.

4.4.2  Effective  Delay  from  a  Single  Node 

Let us assume that there are L disjoint paths from a single node
to the destination; the average frequency of departures on path j is
$(T^j)^{-1}$ and the total delay on that path is $d^j$. Let us also assume
that there exist mutually prime integers $k^1, k^2, \ldots, k^L$ such that:

$$\frac{(T^1)^{-1}}{k^1} = \frac{(T^2)^{-1}}{k^2} = \cdots = \frac{(T^L)^{-1}}{k^L}. \qquad (4.56)$$

As explained in Section 4.2.2, this assumption is not an unreasonable one,
since the set of rationals is dense in the set of reals.

Let us first state the main result of this section, then prove it.

Proposition 4.4: The effective delay, D, from a single node for
the optimum adjustment of the departure times on disjoint paths
1, 2, ..., L having associated average departure frequencies $(T^j)^{-1}$ and
delays $d^j$ for path j, is given by:

$$D = D(V) \qquad (4.57)$$

where D(A) is defined on a set A of paths as:

$$D(A) \triangleq \Big[ \sum_{j \in A} (T^j)^{-1} \Big]^{-1} \Big[ 1 + \sum_{j \in A} \frac{d^j}{T^j} \Big]. \qquad (4.58)$$

The set V is defined as:

$V = \bigcup Y$, where sets Y have the property:

$$Y = \{ j \in \{1, 2, \ldots, L\} : d^j < D(Z) \ \ \forall\, Z \subset \{1, 2, \ldots, L\} \}. \qquad (4.59)$$

Fig. 4.6 Effective delay D from a single sensor node with two paths to the destination node.

Proposition 4.5: The set V of Proposition 4.4 can be efficiently
constructed by the following algorithm, written in Pseudo-Algol:

Begin
    Step 1: A := {1, 2, ..., L};
            V := ∅;
    Step 2: For j = 1 thru L, j ∈ A do
                If d^j < D(A - {j})
                    then V := V ∪ {j};
    Step 3: If V = A
                then Stop;
                else Begin
                    A := V;
                    V := ∅;
                    Go to Step 2
                End;
End.
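
A direct transcription of this algorithm into Python might look as follows. This is a sketch under stated assumptions: paths are given as `(T, d)` pairs indexed from 0, the function names are invented for the example, and $D(\emptyset)$ is taken to be infinite (no path in use means no information ever arrives), a convention the text does not spell out.

```python
import math

def D(paths, A):
    """Effective delay (4.58) on index set A, where paths[j] = (T_j, d_j)."""
    if not A:
        return math.inf  # assumed convention for the empty path set
    freq = sum(1.0 / paths[j][0] for j in A)
    return (1.0 + sum(paths[j][1] / paths[j][0] for j in A)) / freq

def useful_paths(paths):
    """Repeat Steps 2 and 3 of Proposition 4.5 until the set stops shrinking."""
    A = set(range(len(paths)))
    while True:
        # Step 2: keep path j only if its delay beats D on the other paths.
        V = {j for j in A if paths[j][1] < D(paths, A - {j})}
        if V == A:       # Step 3: nothing was eliminated, so stop
            return V
        A = V

# Hypothetical example: the third path is too slow to be useful.
paths = [(1.0, 6.0), (2.0, 5.0), (1.0, 20.0)]
V = useful_paths(paths)
```

Each pass either terminates or removes at least one path, so the loop runs at most L times.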
Before proving these two propositions, let us introduce some
preliminary results.

Lemma 4.5: Let A and B be two disjoint sets of paths, and let
each path j have associated $T^j$ and $d^j$ values. Then one and only one of
the following three cases holds:

(a) $\bar{d}_B < D(A \cup B) < D(A)$    (4.60)

(b) $\bar{d}_B > D(A \cup B) > D(A)$    (4.61)

(c) $\bar{d}_B = D(A \cup B) = D(A)$    (4.62)

Here D(A) is defined on set A as in (4.58) and $\bar{d}_B$ is defined as:

$$\bar{d}_B \triangleq \frac{\sum_{j \in B} d^j / T^j}{\sum_{j \in B} (T^j)^{-1}}. \qquad (4.63)$$

Proof: By trichotomy, there are 3 distinct possibilities:

(a') $D(A \cup B) < D(A)$    (4.64)

(b') $D(A \cup B) > D(A)$    (4.65)

(c') $D(A \cup B) = D(A)$    (4.66)

We will prove that (a') is equivalent to (a). The other cases will
follow identically with changes in inequality and equality signs throughout
the proof.

The proof will be in two stages:

(i) $D(A \cup B) < D(A) \iff \bar{d}_B < D(A)$    (4.67)

(ii) $\bar{d}_B < D(A \cup B) < D(A)$    (4.68)

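Proof of Part (i): Let us assume that $A = \{1, 2, \ldots, M\}$ and
$B = \{M+1, M+2, \ldots, N\}$. Then

$$D(A \cup B) < D(A)$$

$$\iff \Big[ \sum_{j=1}^{N} (T^j)^{-1} \Big]^{-1} \Big[ 1 + \sum_{j=1}^{N} \frac{d^j}{T^j} \Big] < \Big[ \sum_{j=1}^{M} (T^j)^{-1} \Big]^{-1} \Big[ 1 + \sum_{j=1}^{M} \frac{d^j}{T^j} \Big]$$

$$\iff \Big[ \sum_{j=1}^{M} (T^j)^{-1} \Big] \Big[ 1 + \sum_{j=1}^{N} \frac{d^j}{T^j} \Big] < \Big[ \sum_{j=1}^{N} (T^j)^{-1} \Big] \Big[ 1 + \sum_{j=1}^{M} \frac{d^j}{T^j} \Big]$$

$$\iff \Big[ \sum_{j=1}^{M} (T^j)^{-1} \Big] \Big[ \sum_{j=M+1}^{N} \frac{d^j}{T^j} \Big] < \Big[ \sum_{j=M+1}^{N} (T^j)^{-1} \Big] \Big[ 1 + \sum_{j=1}^{M} \frac{d^j}{T^j} \Big]$$

$$\iff \frac{\sum_{j=M+1}^{N} d^j / T^j}{\sum_{j=M+1}^{N} (T^j)^{-1}} < \Big[ \sum_{j=1}^{M} (T^j)^{-1} \Big]^{-1} \Big[ 1 + \sum_{j=1}^{M} \frac{d^j}{T^j} \Big]$$

$$\iff \bar{d}_B < D(A).$$

Part (ii) can be proved in a similar way.

Q.E.D.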
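
A compact way to see why exactly one of the three cases of Lemma 4.5 must hold is to write $D(A \cup B)$ as a weighted average. The shorthand $F_A$ and $F_B$ for the total departure frequencies on the two sets is introduced here only for this remark and does not appear in the original text.

```latex
% Shorthand (assumed for this remark only):
%   F_A = \sum_{j \in A} (T^j)^{-1},  F_B = \sum_{j \in B} (T^j)^{-1}.
% From (4.58) and (4.63):
%   F_A D(A) = 1 + \sum_{j \in A} d^j / T^j,   F_B \bar{d}_B = \sum_{j \in B} d^j / T^j.
% Since A and B are disjoint,
\[
D(A \cup B)
  = \frac{1 + \sum_{j \in A} d^j/T^j + \sum_{j \in B} d^j/T^j}{F_A + F_B}
  = \frac{F_A \, D(A) + F_B \, \bar{d}_B}{F_A + F_B}.
\]
```

Because both weights are positive, $D(A \cup B)$ lies strictly between $\bar{d}_B$ and $D(A)$ whenever the two differ, and coincides with both when they are equal, which is the content of cases (a), (b) and (c).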


Proof of Proposition 4.4: Let us first introduce just one more
variable for added clarity. Define $\bar{D}(A)$ as the effective delay from
the sensor node for the optimum adjustment of departure times on the paths
which are elements of set A. Thus the claim of the proposition is that
$D \triangleq \bar{D}(\{1, 2, \ldots, L\}) = D(V)$.

Suppose we are given the path set V as
$V \triangleq \{v^1, v^2, \ldots, v^M\}$, satisfying (4.59). Reconstruct V as
follows:

$$W^1 = \{v^1\} \qquad (4.69)$$

$$W^n = W^{n-1} \cup \{v^n\}, \quad 2 \le n \le M. \qquad (4.70)$$

We will show that:

$$\bar{D}(W^n) = D(W^n), \quad 1 \le n \le M, \qquad (4.71)$$

and that:

$$D(W^n) < \bar{D}(W^{n-1}), \quad 2 \le n \le M. \qquad (4.72)$$

Then we will show that:

$$D(V) \le D(V \cup \{j\}), \qquad (4.73)$$

$$\bar{D}(V) = \bar{D}(V \cup \{j\}) \quad \forall\, j \notin V,\ j \in \{1, 2, \ldots, L\}. \qquad (4.74)$$

This will complete the proof.

Let us consider a single period of $p(t)$, of length
$k^1 T^1 = k^2 T^2 = \cdots = k^L T^L$. We will show (4.71) and (4.72) for
$W^1$ and $W^2$. The other stages will be similar for $3 \le n \le M$. For
$v^1 \in W^1$, the minimum effective delay occurs for the equal spacing of
departure times, and has the value:

$$\bar{D}(W^1) = T^{v^1} + d^{v^1} = D(W^1). \qquad (4.75)$$

This is well known from the previous sections.

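By (4.56), there will be $k^{v^1}$ messages on path $v^1$ and $k^{v^2}$
messages on path $v^2$ in a period of $p(t)$. Divide path $v^2$ into
$k^{v^2}$ separate paths $\bar{1}, \bar{2}, \ldots, \bar{k}^{v^2}$, each with
period $k^{v^2} T^{v^2}$ and delay $d^{v^2}$.

Consider first the set $W^1 \cup \{\bar{1}\}$. By (4.59) and (4.75),
$d^{v^2} < \bar{D}(W^1)$. From Fig. 4.7 it can be observed that in this case
the effective delay can be reduced by readjustment of departure times on
paths $v^1$ and $\bar{1}$. Therefore
$\bar{D}(W^1 \cup \{\bar{1}\}) < \bar{D}(W^1)$. Now calculate
$\bar{D}(W^1 \cup \{\bar{1}\})$. From Fig. 4.7(b), in the interval
$[0,\ k^{v^1} T^{v^1} + d^{v^1}]$ there are $k^{v^1} + 1$ overlapping
subintervals of length $\bar{D}(W^1 \cup \{\bar{1}\})$. Thus, writing the
entire length in terms of these subintervals minus the overlap durations:

$$(k^{v^1} + 1)\, \bar{D}(W^1 \cup \{\bar{1}\}) - (k^{v^1} - 1)\, d^{v^1} - d^{v^2} = k^{v^1} T^{v^1} + d^{v^1} \qquad (4.76)$$

$$\Rightarrow \bar{D}(W^1 \cup \{\bar{1}\}) = \frac{k^{v^1}}{k^{v^1} + 1} \big( T^{v^1} + d^{v^1} \big) + \frac{1}{k^{v^1} + 1}\, d^{v^2}$$

$$= \Big[ (T^{v^1})^{-1} + (k^{v^2} T^{v^2})^{-1} \Big]^{-1} \Big[ 1 + \frac{d^{v^1}}{T^{v^1}} + \frac{d^{v^2}}{k^{v^2} T^{v^2}} \Big]$$

$$= D(W^1 \cup \{\bar{1}\}). \qquad (4.77)$$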


Then, by Lemma 4.5:

$$\bar{d}_{\{\bar{2}\}} = d^{v^2} < D(W^1 \cup \{\bar{1}\}) = \bar{D}(W^1 \cup \{\bar{1}\}) \qquad (4.78)$$

and similarly as above:

$$\bar{D}(W^1 \cup \{\bar{1}, \bar{2}\}) < \bar{D}(W^1 \cup \{\bar{1}\}). \qquad (4.79)$$

Fig. 4.7 Reduction of the effective delay by readjustment
of departure times.

Repeating the procedure gives:

$$\bar{D}(W^2) = D(W^2) \quad \text{and} \quad \bar{D}(W^2) < D(W^1). \qquad (4.80)$$


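Therefore we have shown (4.71) and (4.72) for $W^1$ and $W^2$. For
the other stages the proof is entirely similar. Given
$d^{v^n} < D(W^{n-1})$, we can show that every new message up to number
$k^{v^n}$ on path $v^n$ reduces the effective delay. $\bar{D}(W^n)$ can also
be calculated as in (4.76), subtracting the overlap durations from the
overlapping intervals of length $\bar{D}(W^n)$:

$$\Big( \sum_{i=1}^{n} k^{v^i} \Big) \bar{D}(W^n) - (k^{v^1} - 1)\, d^{v^1} - \sum_{i=2}^{n} k^{v^i} d^{v^i} = k^{v^1} T^{v^1} + d^{v^1} \qquad (4.81)$$

$$\Rightarrow \bar{D}(W^n) = \Big( \sum_{i=1}^{n} k^{v^i} \Big)^{-1} \Big[ k^{v^1} \big( T^{v^1} + d^{v^1} \big) + \sum_{i=2}^{n} k^{v^i} d^{v^i} \Big]$$

$$= \Big[ \sum_{i=1}^{n} (T^{v^i})^{-1} \Big]^{-1} \Big[ (T^{v^1})^{-1} \big( T^{v^1} + d^{v^1} \big) + \sum_{i=2}^{n} (T^{v^i})^{-1} d^{v^i} \Big]$$

$$= \Big[ \sum_{i=1}^{n} (T^{v^i})^{-1} \Big]^{-1} \Big[ 1 + \sum_{i=1}^{n} \frac{d^{v^i}}{T^{v^i}} \Big] = D(W^n). \qquad (4.82)$$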


Now we will prove (4.73) and (4.74), which can be interpreted as
saying that set V is the largest set of "useful paths". Consider a path
$j \notin V \Rightarrow \exists\, A \subset \{1, 2, \ldots, L\}$ such that
$d^j \ge D(A)$. Define G as the set of such sets A:

$$G \triangleq \{ A \subset \{1, 2, \ldots, L\} : d^j \ge D(A) \}. \qquad (4.83)$$

Let B be the set which satisfies:

$$D(B) = \min_{A \in G} D(A). \qquad (4.84)$$

If $B = V$, then $d^j \ge D(V) \Rightarrow D(V \cup \{j\}) \ge D(V)$ from
Lemma 4.5, giving (4.73).

If $B \ne V$, then write B as $B = B_V \cup \bar{B}_V$, where
$B_V = \{i \in B : i \in V\}$ and $\bar{B}_V = \{i \in B : i \notin V\}$. If
$\bar{B}_V = \emptyset$, then $d^j \ge D(B) = D(B_V) \ge D(V)$ by the already
proved (4.71) and (4.72). So in fact, $B = V$ if $\bar{B}_V = \emptyset$.

If $\bar{B}_V \ne \emptyset$, then there exist two possibilities by
Lemma 4.5:

(i) $\bar{d}_{\bar{B}_V} \ge D(B_V \cup \bar{B}_V) = D(B) \ge D(B_V)$    (4.85)

(ii) $\bar{d}_{\bar{B}_V} < D(B) < D(B_V)$    (4.86)

For case (i), $d^j \ge D(B) \ge D(B_V) \ge D(V)$.

For case (ii), $\bar{d}_{\bar{B}_V} < D(B) \Rightarrow \exists$ a path
$k \in \bar{B}_V$ such that $d^k < D(B)$. Since $k \notin V$, $\exists$ a set C
such that $d^k > D(C) \Rightarrow D(B) > D(C)$, contradicting (4.84). Thus
case (ii) is impossible.

Therefore we have proved (4.73) for all cases. (4.74) follows
directly from (4.73): $D(V) = \bar{D}(V)$ and $d^j \ge D(V)$; therefore
$d^j \ge \bar{D}(V)$. The delay on path j is greater than the effective delay
achieved by set V; therefore, inclusion of a message on path j in a period
of $p(t)$ cannot reduce the effective delay no matter at which point in the
period it is placed. (4.74) states this fact.

Q.E.D.

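Proof of Proposition 4.5: This algorithm picks out the elements
of $\{1, 2, \ldots, L\}$ which are not in V. We will show that at each
iteration of the algorithm at least one such element is detected, unless
all have been already detected.

At the nth iteration, A is the set of paths which have not been
eliminated in the first n-1 iterations. If $A = V$, then
$d^j < D(A - \{j\})\ \forall\, j \in A$ will be indicated. Otherwise, write A
as $A = V \cup \bar{V}$, where $\bar{V} = \{j \in A : j \notin V\}$. Let
$\bar{V} = \{\bar{1}, \bar{2}, \ldots, \bar{M}\}$. We will show that
$\exists\, j \in \bar{V}$ such that $d^j \ge D(A - \{j\})$. If
$d^j > D(A - \{j\})$ for some
$j \in \{\bar{1}, \bar{2}, \ldots, \overline{M-1}\}$, we are done. If
$d^j < D(A - \{j\})\ \forall\, j \in \{\bar{1}, \bar{2}, \ldots, \overline{M-1}\}$,
we have to show that $d^{\bar{M}} > D(A - \{\bar{M}\})$. We have:

$$D(V \cup [\bar{V} - \{\bar{\ell}\}]) > d^{\bar{\ell}} \quad \forall\, \bar{\ell} \in \{\bar{1}, \bar{2}, \ldots, \overline{M-1}\}. \qquad (4.87)$$

Then from Lemma 4.5:

$$\bar{d}_{[\bar{V} - \{\bar{\ell}\}]} > D(V \cup [\bar{V} - \{\bar{\ell}\}]) > d^{\bar{\ell}} \quad \forall\, \bar{\ell} \in \{\bar{1}, \bar{2}, \ldots, \overline{M-1}\}, \qquad (4.88)$$

since $D(V \cup [\bar{V} - \{\bar{\ell}\}]) > D(V)$. Hence:

$$\exists\, j \in (\bar{V} - \{\bar{\ell}\}) \ \text{such that} \ d^j > d^k \quad \forall\, k \in \{\bar{1}, \bar{2}, \ldots, \overline{M-1}\}, \qquad (4.89)$$

which implies that $d^{\bar{M}} > d^{\bar{\ell}}\ \forall\, \bar{\ell} \in \{\bar{1}, \bar{2}, \ldots, \overline{M-1}\}$. Hence:

$$d^{\bar{M}} > \bar{d}_{\bar{V}} > D(V \cup \bar{V}) = D(A) > D(A - \{\bar{M}\}) \qquad (4.90)$$

by using Lemma 4.5 twice.

Q.E.D.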


4.4.3 Minimization of Effective Delay D from a Single
Node with Respect to Average Departure
Frequencies on Paths to Destination

Proposition 4.4 states that the effective delay D from a single node
for the optimum adjustment of the departure times on the paths to the
destination is given by:

$$D = \bar{T}_V + \bar{d}_V \qquad (4.91)$$

where $\bar{T}_V$ is the inverse of the sum of the average departure
frequencies on the paths, or the "effective period"; and $\bar{d}_V$ is the
weighted average of the delays on the paths, or the "effective
communication delay", not to be confused with the "effective (information)
delay" D, which includes the holding of information at the originating
node. The subscript V indicates that $\bar{T}$ and $\bar{d}$ are defined on a
set V of "useful paths". For fixed given values of $T^j$ and $d^j$ for
path j, set V can be constructed by the algorithm given by Proposition 4.5.

In this section, the minimization of D with respect to $T^j$ will be
investigated. If an algorithm is used to minimize $D = D(V)$ as in
(4.57), at each iteration the set V has to be found by using the algorithm
of Proposition 4.5, which would make the computational complexity very
high.
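
For reference, the decomposition into an effective period and an effective communication delay follows from (4.58) in one line. The explicit expressions below are a restatement of the verbal definitions of the effective period and the frequency-weighted average delay; they are written out here as a reading aid.

```latex
\[
\bar{T}_V = \Big[ \sum_{j \in V} (T^j)^{-1} \Big]^{-1},
\qquad
\bar{d}_V = \frac{\sum_{j \in V} d^j / T^j}{\sum_{j \in V} (T^j)^{-1}}
          = \bar{T}_V \sum_{j \in V} \frac{d^j}{T^j},
\]
\[
\bar{T}_V + \bar{d}_V
  = \bar{T}_V \Big[ 1 + \sum_{j \in V} \frac{d^j}{T^j} \Big]
  = D(V).
\]
```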

Rather than minimizing a functional that depends on a set which
may vary every iteration, we will attempt to use a functional which
depends on the fixed path set $A \triangleq \{1, 2, \ldots, L\}$, the set of
all disjoint paths to the destination. Specifically, minimize the functional
$D(A) = \bar{T}_A + \bar{d}_A$. Then obviously the value of the functional
D(A) will not be the same as if we were using D(V), but if the optimal
values of both functionals are the same, this approach will be correct. The
following proposition summarizes some properties of this optimization
approach.

Proposition 4.6: $D^*$, the optimal value of the effective delay
from a node, minimized with respect to average departure frequencies
and departure times on the paths to the destination, is given by:

$$D^* = D(A^*) \triangleq \min_{T^j} D(A) = D(V^*) \triangleq \min_{T^j} D(V) \qquad (4.92)$$

where D(A) is defined in (4.58), V in (4.59) and
$A \triangleq \{1, 2, \ldots, L\}$.

Furthermore, considering $D(A) = D(T^1, T^2, \ldots, T^L)$ as a
surface defined on $\mathbb{R}^L$, define the following curves:

$$\pi(T^k;\ T^1, T^2, \ldots, T^{k-1}, T^{k+1}, \ldots, T^L) \triangleq D(A) \cap H^k(T^1, \ldots, T^{k-1}, T^{k+1}, \ldots, T^L) \qquad (4.93)$$

where

$$H^k = \{ (x_1, x_2, \ldots, x_{k-1}, x_{k+1}, \ldots, x_L) \in \mathbb{R}^{L-1} : x_j = T^j,\ j = 1, 2, \ldots, k-1, k+1, \ldots, L \} \qquad (4.94)$$

is a hyperplane in $\mathbb{R}^{L-1}$. Then
$\pi(T^k;\ T^1, T^2, \ldots, T^{k-1}, T^{k+1}, \ldots, T^L)$ is a function of
$T^k$ which has one stationary point and at most one inflection point for
$k = 1, \ldots, L$ and for all $T^j > (C^j)^{-1}$,
$j = 1, 2, \ldots, k-1, k+1, \ldots, L$, where $C^j$ is the channel capacity
associated with path j. Furthermore, D(A) is a convex function of $T^j$ in
the neighborhood of its minimum.

Proof: All the indices run from 1 to L except those noted otherwise.
$d^{k\prime}$ and $d^{k\prime\prime}$ denote first and second derivatives of
$d^k$ with respect to $T^k$.


(4.95) 

where  the  indices  i  e  [  1  ,  2  ,  •••,  k-1  ,  k,  k+1  ,  •••,  L  }.  Now 
since  dk'  >  0  V  Tk  >  (Ck)“X  from  Chapter  2,  3"  <  o  V  Tk  >  (Ck)-1. 

•  *  *  i/L  i  dX 

Therefore, $T^{k*} = \infty$ and $D(A^*) = D(A^* - \{k^*\})$ from (4.99).

Now again by Proposition 4.5, $\exists\, m \notin V^*$, $m \in A$, such that
$D(A^* - \{m\}) = D(A^* - \{k^*\} - \{m\}) < d^m$, and $T^{m*} = \infty$
will follow similarly as above.

Therefore,

$$A^* = V^* \cup \{ j : T^{j*} = \infty \} \qquad (4.102)$$

which gives (4.92) by (4.98).

Next consider the cross-section curve
$\pi(T^k;\ T^1, T^2, \ldots, T^{k-1}, T^{k+1}, \ldots, T^L)$. At an inflection
point of $\pi$:

Since $d^{k\prime} < 0$ and $d^{k\prime\prime} > 0$ for all $T^k$, and since
the left hand side is positive, in order for (4.103) to have a finite
solution, we must have:

$$D(A - \{k\}) - d^k > 0, \qquad (4.104)$$

or

$$d^k < D(A - \{k\}) \qquad (4.105)$$

at the inflection point. Furthermore, at the inflection point:

$$\frac{d\pi}{dT^k} = \ldots \qquad (4.106)$$

We can also show that $d^2\pi / d(T^k)^2$ is positive to the left of the
inflection point and negative to the right of it. Therefore we have two
possible shapes for $\pi$, as shown in Fig. 4.8.

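To see that there is a neighborhood of D(A) which is convex, we
can write the second order derivatives as:

$$\frac{\partial^2 D}{\partial (T^k)^2} = \frac{\Big( \prod_{i \ne k} T^i \Big) \Big[ d^{k\prime\prime} - 2 \Big( \sum_{i \ne k} (T^i)^{-1} \Big) \dfrac{\partial D}{\partial T^k} \Big]}{\sum_{j} \prod_{i \ne j} T^i} \qquad (4.107)$$

$$\frac{\partial^2 D}{\partial T^m\, \partial T^n} = \frac{\Big( \prod_{i \ne m, n} T^i \Big) \Big( \dfrac{T^m}{T^n} \dfrac{\partial D}{\partial T^m} + \dfrac{T^n}{T^m} \dfrac{\partial D}{\partial T^n} \Big)}{\sum_{j} \prod_{i \ne j} T^i} \qquad (4.108)$$

At the optimum point, the first-order partial derivatives will be
zero while $d^{k\prime\prime} > 0$; therefore the Hessian matrix
$\nabla^2 D(A)$ will be diagonal with positive elements; hence
$\nabla^2 D(A) > 0$. Due to continuity of the first and second order partial
derivatives of $d^j$, there will be a neighborhood of the optimum point
where the Hessian will still be positive, making the functional convex.

Q.E.D.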

For actual implementation of an optimization algorithm, writing
D(A) in terms of frequencies as:

$$D(A) = \frac{1 + \sum_{j \in A} f^j d^j}{\sum_{j \in A} f^j} \qquad (4.109)$$

will be more appropriate, since $f^{k*} = 0$ rather than
$T^{k*} = \infty$ for $k \notin V^*$. Our reason for working with T's was to
interpret D(A) as an effective period plus an effective delay, since we
have been working with terms like T + d in the previous sections also.

For the formulation in terms of frequencies, the possible
cross-section curves are shown in Fig. 4.9.

We want to make a final comment on the minimization of the
effective delay D with respect to $T^j$. One might ask whether minimization
of the effective delays $T^j + d^j$ on paths j independently for all paths
leads to the minimum effective delay from the node. The fact that this
superposition does not hold is illustrated by the following example.

Assume that there are two disjoint paths to the destination, 1 and 2.
Let $T^{1*} \triangleq \arg\min_{T^1} (T^1 + d^1(T^1))$ and
$T^{2*} \triangleq \arg\min_{T^2} (T^2 + d^2(T^2))$. For the values
$T^{1*} = 1$, $d^1(T^{1*}) = 6$; $T^{2*} = 2$, $d^2(T^{2*}) = 5$, assume that
$d^2(T^2 = 6) = 2$. Then $D(T^1 = T^{1*} = 1,\ T^2 = T^{2*} = 2) = 6.33$,
whereas $D(T^1 = T^{1*} = 1,\ T^2 = 6) = 6.29$. Therefore
$D(T^{1*}, T^{2*}, \ldots, T^{L*}) \ne D^*$.
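
The two values quoted in this example can be checked directly against (4.58). The helper name below is an assumption made for this sketch; the periods and delays are the ones given in the example.

```python
def effective_delay(periods, delays):
    """D(A) from (4.58) for paths with the given periods T^j and delays d^j."""
    freq = sum(1.0 / T for T in periods)
    return (1.0 + sum(d / T for T, d in zip(periods, delays))) / freq

# Both paths at their individually optimal periods: T = (1, 2), d = (6, 5).
individually_optimal = effective_delay([1.0, 2.0], [6.0, 5.0])

# Second path slowed down to T^2 = 6, where its delay drops to 2.
slower_second_path = effective_delay([1.0, 6.0], [6.0, 2.0])
```

The first value is 19/3 and the second is 44/7, so slowing down the second path lowers the effective delay even though $T^2 + d^2$ rises from 7 to 8 on that path.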


Fig. 4.9 Two possible shapes for the $\pi$ curve as
a function of $f^k$.
4.4.4 Minimization of Worst-Case Peak Value of the
Mean-Square Error for the General Network

For a network with a single sensor node and several links to the
destination, the worst-case peak value of the mean-square error is:

$$p_{wc} = \operatorname{trace} P(D) \qquad (4.110)$$

$$\dot{P}(t) = A P(t) + P(t) A' + B Q B' \qquad (4.111)$$

$$0 = A P(0) + P(0) A' + B Q B' - P(0)\, C' R^{-1} C\, P(0) \qquad (4.112)$$

$$D = \max_j \Big( t_a^j - \max_{t_a^k < t_a^j} t_d^k \Big) \quad \forall\, j, k. \qquad (4.113)$$

The optimization problem is:

$$\min_{T^j}\ p_{wc} \qquad (4.114)$$

where $(T^j)^{-1}$ is the average reporting frequency on link j.

Since tr P(D) is a monotonically increasing function of D (due to the
special structure of matrix A; see Section 2.3), minimizing $p_{wc}$
with respect to $T^j$ is equivalent to minimizing D with respect to $T^j$,
which was discussed in the last section.

For the general network with N sensor nodes, $p_{wc}$ is given by
(4.47) through (4.54), where

$$D_i = \Big[ \sum_{j \in \Omega(i)} (T_{ij})^{-1} \Big]^{-1} \Big[ 1 + \sum_{j \in \Omega(i)} \frac{d_{ij}(T_{ij}) + D_j}{T_{ij}} \Big]. \qquad (4.115)$$

$(T_{ij})^{-1}$ is the average departure frequency and $d_{ij}(T_{ij})$ is the
corresponding delay on link $(i, j)$. $\Omega(i)$ is the set of nodes for
which link $(i, j)$ exists. For the path delay in (4.58), we have
substituted $d_{ij}(T_{ij}) + D_j$, which represents the worst-case effective
delay from node #i to the destination over link $(i, j)$ and node #j. As we
have shown in the last section, the above expression does not equal the
effective delay from node #i at each iteration of an algorithm used to
minimize $p_{wc}$. Rather, its value equals the effective delay at each
local minimum of $p_{wc}$. Remember that the effective delay reflects the
optimum adjustment of the departure times on the links originating from a
sensor node, for given average departure frequencies and corresponding
communication delays.

For the above formulation, the $D_i$ are not independent as was the case
in Section 4.3; therefore they cannot be minimized independently. On
the other hand, in this section we assume that matrix A can be any
multivariable square real matrix, so $p_{wc}$ is not a convex, concave or
monotonic function of the $D_i$ in general, only a continuous function of
them. The behavior of $p_{wc}$ as a function of $D_i$ depends entirely on
the system and observation model with the statistical characterizations
and, as such, it is decoupled from the behavior of $D_i$ as a function of
$T_{ij}$, which depends entirely on the communications model of the
network. Therefore, it is not in general possible to arrive at
conclusions about the nature of $p_{wc}$ as a function of $T_{ij}$, i.e.
whether it has a unique minimum, etc.

However, it can be shown that in a neighborhood of the local
minima, $p_{wc}$ is a convex function of $T_{ij}$. This follows from the
fact that $\partial D_i / \partial T_{jk}$ is finite for all i, j, k (can
be shown from the results of the last section) and that
$\partial p_{wc} / \partial D_i$ is also finite, from (4.47) through
(4.54) and assuming that the matrices P(0) and Q are finite. Therefore,
in these convex neighborhoods of the local minima, we can use some
nonlinear programming algorithms for the minimization of $p_{wc}$.

Before proceeding further though, we have to introduce a minor
adjustment. In the last section we have shown that for paths
$j \notin V$, $T^{j*} = \infty$ or $f^{j*} = 0$. Figure 4.8 shows that as
$T^j \to \infty$, D is a convex function of $T^j$, but it is not possible
to run a numerical algorithm when the optimum point is at infinity.
Alternatively, as Fig. 4.9 shows, as $f^j \to 0$, D is a concave function
of $f^j$, rendering the formulation of D in terms of $f_{ij}$'s
undesirable. One way to solve this problem is to keep the formulation in
terms of $T_{ij}$'s, and to first find the links (i, j) for which
$T_{ij}^* = \infty$, by using the algorithm of Proposition 4.5, and then
execute the (bigger) algorithm for the remaining variables for each
iteration of the (bigger) algorithm.

A second possible approach is less precise, but more convenient.
For each link (i, j), we assign an upper limit on $T_{ij}$, call it
$U_{ij}$, and say that the optimum value of $T_{ij}$ is $\infty$ if the
algorithm gives $T_{ij}^* = U_{ij}$. Useful values of $U_{ij}$ may be taken
based on physical considerations, but can be readjusted by checking with
the algorithm of Proposition 4.5.

We  now  describe  the  algorithm,  which  is  based  on  a  class  of 
algorithms  introduced  in  [12]  and  [13]. 

Proposition 4.7:  The optimization problem is:

    $\min_{C^{-1} \le T \le U} \; p_{wc}(T)$    (4.116)

where T is the vector of $T_{ij}$'s, and $C^{-1}$ and U are the
corresponding vectors of $C_{ij}^{-1}$'s and $U_{ij}$'s, respectively.
$p_{wc}$ is given by (4.47) through (4.54) and (4.115). For simplicity of
notation, the elements of T will be denoted as $T^1, T^2, \ldots$, and
similarly for $C^{-1}$, U. Subscripts will denote the iterations.

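For a vector Z we denote by $[Z]^{\#}$ the vector with coordinates:

    $[Z]^{\# i} = \begin{cases} U^i & \text{if } Z^i \ge U^i \\ Z^i & \text{if } (C^i)^{-1} < Z^i < U^i \\ (C^i)^{-1} & \text{if } Z^i \le (C^i)^{-1} \end{cases}$    (4.117)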
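
The projection in (4.117) amounts to clipping each coordinate to its box. A minimal sketch, assuming the lower bound on each period is the reciprocal capacity $(C^i)^{-1}$ and the upper bound is $U^i$; the function name `project` and all numerical values are hypothetical and chosen only for illustration:

```python
def project(z, lower, upper):
    """Clip each coordinate of z to [lower[i], upper[i]], as in (4.117)."""
    return [min(max(zi, lo), hi) for zi, lo, hi in zip(z, lower, upper)]


# Hypothetical link capacities C^i and period caps U^i (not from the text).
capacities = [2.0, 4.0, 1.0]
lower = [1.0 / c for c in capacities]   # (C^i)^{-1}
upper = [10.0, 10.0, 10.0]              # U^i

# First coordinate is below its lower bound, third is above its cap.
projected = project([0.1, 3.0, 25.0], lower, upper)
```

Coordinates strictly inside the box are returned unchanged, so the operator is the identity wherever no constraint is active.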


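The k-th iteration is as follows:

Step 1:  Find the set:

    $I_k^+ = \Bigl\{\, i \;\Big|\; (C^i)^{-1} \le T_k^i \le (C^i)^{-1} + \epsilon_k^i \text{ and } \frac{\partial p_{wc}(T_k)}{\partial T^i} > 0, \text{ or } U^i - \epsilon_k^i \le T_k^i \le U^i \text{ and } \frac{\partial p_{wc}(T_k)}{\partial T^i} < 0 \,\Bigr\}$    (4.118)

where

    $\epsilon_k^i = \min \{ \epsilon,\; s_k^i \}$    (4.119)

where $\epsilon > 0$ is a constant scalar and

    $s_k^i = \Bigl|\, T_k^i - \Bigl[ T_k^i - \mu_k^i \frac{\partial p_{wc}(T_k)}{\partial T^i} \Bigr]^{\#} \Bigr|$    (4.120)

and $\mu_k^i$ are scalar sequences such that

    $\mu_k^i \ge \underline{\mu} > 0$    (4.121)

with $\underline{\mu} > 0$ a constant scalar.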
Step 2:  Partition $T_k$ as:

    $T_k = [\, \bar{T}_k,\; \tilde{T}_k \,]$    (4.122)

where $\bar{T}_k$ is the vector of coordinates with $i \in I_k^+$ and
$\tilde{T}_k$ is the vector of coordinates with $i \notin I_k^+$. Then a
"search direction",

    $\eta_k = [\, \bar{\eta}_k,\; \tilde{\eta}_k \,]$    (4.123)

is obtained by solving the systems of equations:

    $\Theta_k \bar{\eta}_k = -\bar{a}_k$    (4.124)

    $H_k \tilde{\eta}_k = -\tilde{a}_k$    (4.125)

where $\bar{a}_k$ (or $\tilde{a}_k$) is the vector with coordinates
$\partial p_{wc}(T_k) / \partial T^i$ with $i \in I_k^+$ (respectively
$i \notin I_k^+$), $\Theta_k$ is a diagonal positive definite matrix with
elements $\partial^2 p_{wc}(T_k) / (\partial T^i)^2$ along the diagonal,
and $H_k$ is a symmetric positive definite matrix which is equal to the
Hessian of $p_{wc}$ with respect to the coordinates $T^i$,
$i \notin I_k^+$.

Equation (4.125) may be computationally impractical to solve
exactly, and an approximate solution by the following scaled version of
the conjugate gradient method may be used.

Choose a positive definite symmetric matrix $M_k$ and generate the
sequence $\{Z_m\}$ according to the iteration:

    $Z_{m+1} = Z_m + \gamma_m \omega_m, \quad Z_0 = 0$    (4.126)

where the conjugate direction sequence $\{\omega_m\}$ is given by:

    $\omega_0 = -M_k r_0, \quad \omega_m = -M_k r_m + \beta_m \omega_{m-1}, \quad m = 1, 2, \ldots$    (4.127)

the residual sequence $\{r_m\}$ is defined by:

    $r_m = H_k Z_m + \tilde{a}_k, \quad m = 0, 1, \ldots$    (4.128)

and the scalars $\gamma_m$ and $\beta_m$ are given by:

    $\gamma_m = \dfrac{r_m' M_k r_m}{\omega_m' H_k \omega_m}, \quad m = 0, 1, \ldots$    (4.129)

    $\beta_m = \dfrac{r_m' M_k r_m}{r_{m-1}' M_k r_{m-1}}, \quad m = 1, 2, \ldots$    (4.130)

Terminate at an iteration m if the residual satisfies:

    $|r_m| \le \theta_k |r_0|$    (4.131)

where $\theta_k$ is some scalar factor less than unity which may depend on
the iteration index k.

Then use $\tilde{\eta}_k = Z_m$.
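
The scaled conjugate gradient iteration can be sketched in a few lines. This is a minimal illustration, assuming the standard preconditioned form (residual r = HZ + a, scaling matrix M, relative-residual stopping factor theta); the names `scaled_cg`, `h`, `a_vec`, `m` and `theta`, and the 2-by-2 test system, are hypothetical and not taken from the text:

```python
def matvec(a, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def scaled_cg(h, a_vec, m, theta, max_iter=50):
    """Approximately solve h z = -a_vec, stopping when |r| <= theta * |r_0|."""
    z = [0.0] * len(a_vec)
    r = list(a_vec)                      # residual r_0 = h z_0 + a with z_0 = 0
    r0_norm = dot(r, r) ** 0.5
    if r0_norm == 0.0:
        return z
    mr = matvec(m, r)
    w = [-v for v in mr]                 # first direction: -M r_0
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= theta * r0_norm:
            break
        hw = matvec(h, w)
        rmr = dot(r, mr)
        gamma = rmr / dot(w, hw)         # step length along w
        z = [zi + gamma * wi for zi, wi in zip(z, w)]
        r = [ri + gamma * hwi for ri, hwi in zip(r, hw)]
        mr = matvec(m, r)
        beta = dot(r, mr) / rmr          # conjugacy coefficient
        w = [-mri + beta * wi for mri, wi in zip(mr, w)]
    return z


# Hypothetical 2x2 example: solve H z = -a with a diagonal (Jacobi) scaling.
h = [[4.0, 1.0], [1.0, 3.0]]
a_vec = [-1.0, -2.0]
m = [[0.25, 0.0], [0.0, 1.0 / 3.0]]
z = scaled_cg(h, a_vec, m, theta=1e-10)
```

In this sketch a looser `theta` simply stops the loop earlier, which is the trade-off between work per outer iteration and accuracy of the search direction.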

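Step 3:  Then

    $T_{k+1} = T_k(\alpha_k)$    (4.132)

where $\alpha_k = \beta^{m_k}$, and $m_k$ is the first non-negative integer
m such that:

    $p_{wc}(T_k) - p_{wc}(T_k(\beta^m)) \ge \sigma \Bigl\{ -\beta^m \sum_{i \notin I_k^+} \frac{\partial p_{wc}(T_k)}{\partial T^i} \eta_k^i + \sum_{i \in I_k^+} \frac{\partial p_{wc}(T_k)}{\partial T^i} \bigl[ T_k^i - T_k^i(\beta^m) \bigr] \Bigr\}$    (4.133)

where

    $T_k(\alpha) \triangleq [\, T_k + \alpha \eta_k \,]^{\#} \quad \forall\, \alpha \ge 0$    (4.134)

and $\beta \in (0, 1)$, $\sigma \in (0, \tfrac{1}{2})$.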
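
The step-size rule of Step 3 can be illustrated with a small sketch. It assumes the usual Armijo-like test along the projection arc, with a sufficient-decrease constant sigma and a reduction factor beta; the function names, the quadratic test objective and the box limits are hypothetical and stand in for $p_{wc}$ and its constraints:

```python
def project(z, lower, upper):
    """Clip each coordinate of z to its box [lower[i], upper[i]]."""
    return [min(max(zi, lo), hi) for zi, lo, hi in zip(z, lower, upper)]


def armijo_projected_step(f, grad, t, eta, active, lower, upper,
                          beta=0.5, sigma=0.1, max_m=30):
    """Return the first trial point T(beta^m) that passes the decrease test."""
    f0 = f(t)
    t_new = list(t)
    for m in range(max_m):
        alpha = beta ** m
        t_new = project([ti + alpha * ei for ti, ei in zip(t, eta)],
                        lower, upper)
        bound = 0.0
        for i, gi in enumerate(grad):
            if i in active:
                # Coordinates in the active set: actual displacement.
                bound += gi * (t[i] - t_new[i])
            else:
                # Remaining coordinates: first-order predicted decrease.
                bound += -alpha * gi * eta[i]
        if f0 - f(t_new) >= sigma * bound:
            return t_new
    return t_new


def f(t):
    # Hypothetical separable quadratic standing in for the objective.
    return (t[0] - 3.0) ** 2 + (t[1] + 1.0) ** 2


t_k = [1.0, 0.0]
grad_k = [2.0 * (t_k[0] - 3.0), 2.0 * (t_k[1] + 1.0)]
eta_k = [-g / 2.0 for g in grad_k]   # Newton scaling: second derivative is 2
t_next = armijo_projected_step(f, grad_k, t_k, eta_k, active={1},
                               lower=[0.0, 0.0], upper=[5.0, 5.0])
```

Here the second coordinate sits on its lower bound with a positive partial derivative, so it belongs to the active set and stays on the bound while the first coordinate takes a full step.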


As proved in the papers mentioned above, this algorithm has a
superlinear convergence rate, and with the approximate conjugate
directions solution in Step 2, it is very suitable for large-scale
optimization problems such as the case at hand.

The iterations are centralized, and as such they suit the
optimization problem under consideration. Calculation of the partial
derivatives of $p_{wc}$ requires knowledge of the values of all the
variables in the network and therefore rules out decentralized
iterations.




CHAPTER  5 


EXTENSIONS 


5.1  Introduction

In this chapter we will relax some of the restrictions of the
model assumed in Chapter 4.  However,  rather  than  paralleling  the
analysis  given  there,  we  will  point  out  some  major  differences  in  the 
type  of  solutions  that  result.  In  Chapter  4  we  considered  a  network  of 
sensors  which  have  been  taking  measurements  for  a  long  time  so  that  the 
filtering  problem  at  each  node  has  reached  a  steady  state.  The  link 
delays  were  modelled  as  deterministic  functions  of  traffic  rates  on  the 
links  and  independent  of  the  traffic  on  other  links.  In  Section  5. 2  we  will 
consider  some  formulations  based  on  probabilistic  models  for  the  delays. 
In  Section  5. 3  we  will  demonstrate  the  effect  of  interdependent  delays 
with  an  example. 

5.2  Formulations Based on Probabilistic Models for Delays

5.2.1  Introduction

In Chapter 4, we assumed the message delays on links to be
deterministic functions of the traffic rates. Actually, as discussed in
Chapter 2, the delays are random variables and we approximated their
stochastic behavior with their mean values.

In this section, we will consider some implications of the
probabilistic models, particularly the effect of the reversal of the
order of messages from the same source. We will also discuss some
relevant objective functionals for the probabilistic framework.

We will assume that on link (i, j) node #i sends messages every T
seconds to node #j, and that the delays of different messages are
independent random variables with the same distribution.

5.2.2  Order Reversal of Messages and Statistical Characterization of Delays


First  let  us  consider  the  effect  of  the  order  reversal  of  messages. 
This  situation  happens  when  a  certain  message  A  arrives  later  than 
another  message  B  which  was  sent  after  message  A.  Assuming  that 
each  message  contains  sufficient  statistics  for  the  entire  observation 
history  of  the  source  node,  in  this  case  message  A  will  be  rendered 
useless,  since  message  B  carries  all  the  information  of  message  A  plus 
more.  Therefore  the  resulting  effect  will  be  as  if  message  A  was  lost, 
or  alternatively,  as  if  the  source  skipped  a  reporting  time. 

Figure 5.1 illustrates the effect of an order reversal on the
mean-square estimation error. The m.s.e. at the destination node is
plotted for a one-sensor, one-destination network configuration. The
observed system is modelled as a Brownian motion process.

In this example the message sent at time T arrived after the one
sent at time 2T; thus it was useless for the destination node. If a
message arrives before all the messages that are sent later, we will say
that that message has "arrived". Otherwise we will say that it is "lost".
The occurrence of two consecutive lost messages will be called a
"double" and the occurrence of n consecutive lost messages will be
called an "n-tuple".
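
The "arrived"/"lost" classification can be made concrete with a short sketch. It assumes message k is sent at time kT and that only messages within a finite record are compared; the function names and the delay samples are hypothetical:

```python
def classify_messages(delays, period):
    """Label each message True ("arrived") or False ("lost").

    Message k is sent at k * period and reaches the destination at
    k * period + delays[k].  It has "arrived" if it reaches the
    destination before every message sent after it.
    """
    arrival = [k * period + d for k, d in enumerate(delays)]
    labels = [True] * len(arrival)
    earliest_later = float("inf")
    # Scan backwards, tracking the earliest arrival among later messages.
    for k in range(len(arrival) - 1, -1, -1):
        labels[k] = arrival[k] < earliest_later
        earliest_later = min(earliest_later, arrival[k])
    return labels


def lost_runs(labels):
    """Lengths of maximal runs of consecutive lost messages ("n-tuples")."""
    runs, current = [], 0
    for ok in labels:
        if ok:
            if current:
                runs.append(current)
            current = 0
        else:
            current += 1
    if current:
        runs.append(current)
    return runs


# Hypothetical delay samples with T = 1: the message sent at T is
# overtaken by the one sent at 2T, and two later messages form a "double".
labels = classify_messages([0.2, 1.5, 0.3, 2.6, 1.4, 0.1, 0.2], period=1.0)
runs = lost_runs(labels)
```

Counting how often each run length occurs in a long record gives an empirical estimate of the "n-tuple" frequencies.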

For a quantitative analysis of the order reversal effect, we will
need the expected frequencies of occurrence of "n-tuples". This is
equal to the probability that an "n-tuple" starts with a message which is
sent at some $t_0$, long after the process has started. In other words,
it is the probability that the message sent at $t_0 - T$ "arrives", the
messages sent at $t_0, t_0 + T, \ldots, t_0 + (n-1)T$ are "lost", and the
message sent at $t_0 + nT$ "arrives" again. If we denote the delay
incurred by the message sent at $t_0 + kT$ by $d_k$, then the probability
of this event is:

    $p_n \triangleq \text{Prob}(\text{"n-tuple" at } t_0)$
    $\quad = \text{Prob}\,[\, d_{-1} < d_k + (k+1)T \;\; \forall\, k = 0, 1, 2, \ldots$
    $\qquad \text{and } \exists\, k = 1, 2, \ldots \text{ s.t. } d_0 > d_k + kT$
    $\qquad \text{and } \exists\, k = 2, 3, \ldots \text{ s.t. } d_1 > d_k + (k-1)T$
    $\qquad \vdots$
    $\qquad \text{and } \exists\, k = n, n+1, \ldots \text{ s.t. } d_{n-1} > d_k + (k-n+1)T$
    $\qquad \text{and } d_n < d_k + (k-n)T \;\; \forall\, k = n+1, n+2, \ldots \,]$    (5.1)

Fig. 5.1  Mean-square estimation error vs. time for a single
sensor configuration with stochastic delays and
Brownian motion process under observation.

If the statistical characterization of delays is given by a probability
density function, and if this p.d.f. takes on nonzero values in a finite
interval only, the above probability can be calculated in terms of multiple
integrals. Alternatively, rather than the p.d.f. for the delays, the
expected frequencies of "n-tuples", $p_n$, may be specified. In fact, these
probabilities are much easier to estimate than the p.d.f. The only other
statistic that we need for further analysis is the mean value of the delays.

5.2.3  Optimization Criteria

Worst-case optimization as in Chapter 4 is not as meaningful when
the delays are viewed as random variables, especially if the p.d.f.'s for
delays on some links have nonzero values over an infinite domain.
Therefore the natural criteria are the ones which are based on mean
values.

One possible formulation is:

    $\min_T \; E[p(t)]$    (5.2)

Since the delays are random variables, the mean-square error, p(t),
becomes a stochastic process. For a one-sensor-node, one-destination-node
configuration, its expected value is:


E  p(t)  =  tr  S  + 


V 

L 

n 


n 


T+d 


1 

T 


$ 


k+n 


tr  P(t)  dt  (5.3) 


where

    $0 = A S + S A' + B Q B' - S C' R^{-1} C S$    (5.4)

    $\dot{P}(t) = A P(t) + P(t) A' + B Q B'$    (5.5)

    $P(0) = S$    (5.6)
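
For a scalar system, (5.4) through (5.6) have closed-form solutions, which makes a quick numerical check possible. The sketch below assumes scalar A = a, B = b, C = c with intensities q and r; the function names and the parameter values are hypothetical:

```python
import math


def steady_state_variance(a, b, c, q, r):
    """Positive root S of 0 = 2 a S + b^2 q - S^2 c^2 / r (scalar (5.4))."""
    return r * (a + math.sqrt(a * a + b * b * c * c * q / r)) / (c * c)


def predicted_variance(s, a, b, q, t):
    """Solution of dP/dt = 2 a P + b^2 q with P(0) = s (scalar (5.5)-(5.6))."""
    if a == 0.0:
        return s + b * b * q * t
    drift = b * b * q / (2.0 * a)
    return (s + drift) * math.exp(2.0 * a * t) - drift


# Wiener process (a = 0): S = sqrt(q r), and P(t) grows linearly in t.
s_wiener = steady_state_variance(0.0, 1.0, 1.0, q=1.0, r=4.0)
p_after_3 = predicted_variance(s_wiener, 0.0, 1.0, q=1.0, t=3.0)

# A stable scalar system (a = -1) as a second check of the Riccati root.
s_stable = steady_state_variance(-1.0, 1.0, 1.0, q=1.0, r=1.0)
riccati_residual = 2.0 * (-1.0) * s_stable + 1.0 - s_stable ** 2
```

In the Wiener case the prediction error grows without bound between reports, which is why the reporting period and the delays enter the objective directly.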

For the special case when the observed system is modelled as a
Wiener process, the optimization problem has a particularly simple
form:

    $\min_T \; E[p(t)] \;\Longleftrightarrow\; \min_T \Bigl[\, k \frac{T}{2} + m_d(T) \Bigr]$    (5.7)

where

    $k = 1 + \sum_n (n^2 - n)\, p_n$    (5.8)

and $m_d(T)$ is the mean value of the delay on the link between the two
nodes. It is dependent on the reporting period T, because the probability
distribution of the delay is a function of T.
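
As a numerical illustration of this trade-off, the sketch below reads the Wiener-process objective as k T/2 + m_d(T), with k computed from the "n-tuple" frequencies as in (5.8). The mean-delay model m_d(T) = d0 + c/T, the frequencies p_n (treated here as fixed) and all numerical values are assumptions made only for this example:

```python
import math


def reversal_factor(p):
    """k = 1 + sum over n of (n^2 - n) p_n, with p given as {n: p_n}."""
    return 1.0 + sum((n * n - n) * pn for n, pn in p.items())


def golden_section_min(fun, lo, hi, tol=1e-7):
    """Minimize a unimodal function of one variable on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    x1 = hi - ratio * (hi - lo)
    x2 = lo + ratio * (hi - lo)
    while hi - lo > tol:
        if fun(x1) < fun(x2):
            hi, x2 = x2, x1
            x1 = hi - ratio * (hi - lo)
        else:
            lo, x1 = x1, x2
            x2 = lo + ratio * (hi - lo)
    return (lo + hi) / 2.0


# Hypothetical "n-tuple" frequencies and mean-delay model.
p = {1: 0.05, 2: 0.01, 3: 0.002}
k = reversal_factor(p)
d0, c = 1.0, 2.0


def mean_delay(t):
    return d0 + c / t          # delay grows as the reporting period shrinks


def objective(t):
    return k * t / 2.0 + mean_delay(t)


t_star = golden_section_min(objective, 0.1, 20.0)
```

For this particular delay model the minimizer is also available in closed form, T* = sqrt(2c/k), which gives a check on the numerical search.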

Another possible formulation, which is simpler than (5.2), is to
minimize the mean of the peaks of the p(t) waveform. Then the
optimization problem for a one-sensor-node, one-destination-node
configuration becomes:

    $\min_T \; \sum_n p_n \cdot E \operatorname{tr} P(nT + d)$    (5.9)

where P(t) is given by (5.4) - (5.6).

For these formulations, it is not easy to find closed form
expressions for the objective functions for a general multinode,
multilink network configuration. Instead, we propose to introduce another
approximation in order to develop a mathematically more tractable
objective function. According to our present model, the sensor nodes
make continuous observations of a stochastic process and report their
sufficient statistics to other nodes over communication links at discrete
times. Therefore the information about an observation made at a
particular point in time "waits" at this node until the next reporting
time; plus it incurs a communication delay before reaching the node at
the other end of the link. Likewise, a message originating at some node
#i and routed through an intermediate node #j on its way to node #k has
to wait at node #j until the next message leaves for node #k. Naturally,
it too incurs a communication delay on link (j, k). If an observation is
made or a message from another node is received at time t, let δ(t)
denote the total time it takes for the information about it to reach the
other node. Figure 5.2 illustrates δ(t) corresponding to the delay
samples of Fig. 5.1.

The mean value of δ(t) can be calculated:

    $E[\delta(t)] = \Bigl[ 1 + \sum_n (n^2 - n)\, p_n \Bigr] \frac{T}{2} + m_d(T)$    (5.10)

Now the approximation proposed is to model the messages as
being sent continuously rather than periodically and to assign a fixed
delay E[δ(t)] to these messages. Obviously, E[δ(t)] will depend on the
actual p.d.f. for delays on that link and the actual reporting period T.
Under this new model, the mean-square estimation error, p(t), will be
constant in the steady state, and the natural objective function will be
to minimize this. With this new model, all links are effectively
assigned a "length", and the optimization procedures discussed in
Chapter 4 are equally valid for this new set of lengths.


5.3  Interdependent Delays

We will present a simple numerical example to show that, when
link delays are dependent on traffic on other links, sensors whose
observations are too noisy may have to be shut off; or links whose
traffic has a very adverse effect on the delays of other links may have to
be left unused.

Consider the configuration of Fig. 5.3, where there are two
sensor nodes and a destination node:

Fig. 5.3  System configuration for Section 5.3.

The sensors are assumed to be making continuous observations
of a Wiener process; their observation noises are independent Brownian
motion processes with intensities $r_1$ and $r_2$:

    $dx(t) = dw(t)$

    $dy_i(t) = x(t)\,dt + dv_i(t), \quad i = 1, 2$    (5.11)

We assume that the sensors have been making observations for a
long time, so that the local filters can be assumed to have reached the
steady state.

The sensor nodes continuously send sufficient statistics to the
destination node.

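The delays on links (1, 3) and (2, 3) are $D_1$ and $D_2$. We adopt the
following model for the delays:

    $D_1 = \frac{T_1}{2} + d_1, \quad d_1 = k_{10} + \frac{k_{11}}{T_1} + \frac{k_{12}}{T_2}$    (5.12)

    $D_2 = \frac{T_2}{2} + d_2, \quad d_2 = k_{20} + \frac{k_{21}}{T_1} + \frac{k_{22}}{T_2}$    (5.13)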
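
A short sketch shows how the cross-coupling terms enter. It assumes each effective delay is half the reporting period plus a link delay that is affine in the message frequencies 1/T_1 and 1/T_2 of both links; this reading of (5.12)-(5.13), the function name, and the coefficient values are assumptions for illustration only:

```python
def link_delays(t1, t2, k1, k2):
    """Effective delays (D_1, D_2) for reporting periods t1 and t2.

    k1 = (k10, k11, k12) and k2 = (k20, k21, k22) are the delay
    coefficients of links (1, 3) and (2, 3).
    """
    f1, f2 = 1.0 / t1, 1.0 / t2              # message frequencies
    d1 = k1[0] + k1[1] * f1 + k1[2] * f2     # own traffic + cross-coupling
    d2 = k2[0] + k2[1] * f1 + k2[2] * f2
    return t1 / 2.0 + d1, t2 / 2.0 + d2


# Hypothetical coefficients and periods.
D1, D2 = link_delays(2.0, 4.0, k1=(4.0, 2.0, 1.0), k2=(1.0, 1.0, 2.0))

# Shutting sensor #2 off (infinite period) removes its load from link (1, 3).
D1_alone, _ = link_delays(2.0, float("inf"), k1=(4.0, 2.0, 1.0),
                          k2=(1.0, 1.0, 2.0))
```

With these numbers, silencing the second sensor lowers the delay seen by the first, which is the mechanism behind shutting off a noisy sensor.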

This model can be interpreted as the continuous approximation to
discrete-time communications using average effective delay of
information. The delays include cross-coupling terms which reflect the
effect of the traffic on one link on the delay of the other.

Then, if $D_1 > D_2$, the mean-square error of the state estimate at
the destination node is:


1  -  P„  e 


1  +  v 


■fatr2  * 


rl  r2 
rl  +  r2 


6  i  6 


7^7  - 

•/^2  +v,t'req 


(5.14) 


Let  us  assume  the  following  parameters  for  the  link  delay  functions: 


:io 

=  4, 

ku 

=  2, 

II 

w 

H 

20 

=  1, 

k2l 

* 

H 

• 

o 

II 

k22  = 

For  two  different  sets  of  noise  intensities,  we  get  the  following 
optimal  (starred)  values  for  the  periods,  delays  and  the  mean-square 
error: 


Case  1: 


T:  =  2.10 


40,  r2  =  400 


Dx  =  6.32 


2.47 


Case  2: 


q  ■  1. 


11.72 

16, 


T2  =  1.41 


p  =4.41 
CHAPTER  6


SUMMARY  AND  DIRECTIONS  FOR  FUTURE  RESEARCH

6.1  Summary

In this thesis we considered decentralized linear estimation
problems with data-traffic-dependent delay constraints arising from the
assumed underlying communications network structure. We assumed
that the system under observation can be modelled as a linear
multivariable system driven by Brownian motion; that the sensors in the
estimation network make independent noisy observations; and that the
system and observation models are such that the estimation problem is
in a "steady state".

The delays on the communication links between the sensor nodes
were considered to be deterministic, convex and monotonically increasing
functions of the traffic rates in the network for much of the thesis.

The  linear  estimate  of  the  state  of  the  system  under  observation 
was  desired  at  the  so-called  "destination  node",  and  the  objective  was  to 
minimize  the  highest  value  the  mean-square  estimation  error  could  take 
over  time  at  this  node.  For  the  special  case  of  a  network  composed  of 
two  sensor  nodes  with  direct  connections  to  the  destination  node,  an 
algorithm  was  described  to  minimize  this  performance  criterion  as  a 
function  of  the  frequencies  of  the  periodic  messages  sent  by  the  sensors 
to  report  the  statistical  content  of  their  observations,  and  also  as  a 
function  of  the  time  or  phase  relationship  between  the  message  sequences 
of  the  two  nodes. 

When the network structure is generalized to allow an arbitrary
number of sensor nodes and an arbitrary number of communication links
between these nodes, messages are ordinarily routed over intermediate
nodes on their way from a source node to the destination node. However,
unlike the usual communication networks where the messages must be
relayed to their destination intact, we have shown that the statistical
content of the messages can be combined at the intermediate nodes without
any loss of information, so that the traffic does not increase in the
downstream links of the network due to messages originating in upstream
nodes. Furthermore, sufficient statistics were found which permit
fusion of statistical data from different sources with no loss of information
without having to transmit the entire set of observations.

For the general network, a further complication arises over the
two-sensor-node example. Routing, or message frequencies on alternate
paths, must be considered as an additional factor in optimization. To
alleviate some of the complexity, a form of worst-case optimization
policy was adopted by considering the situation where the phase
relationships between the message sequences on the links lead to the
highest possible mean-square error at the destination. Thus, the phase
relationships between the message sequences were eliminated from the set
of control variables.

We have pursued two types of worst-case optimization approaches
for the general network. In both cases, we assumed that the underlying
communications network is of wire-network type, where the message
delays are dependent only on the message traffic on the link on which the
message is travelling. In the first case, we allowed any node in the
network to send messages to only one other node. Under this restriction,
the optimization problem was particularly simple and resulted in a
spanning-tree type of routing solution. In the second case, we allowed
the nodes to route their messages to the destination along multiple paths.
Previously unused channel capacity was thus utilized for better
performance. Since the message delays on separate paths from a sensor node
to the destination node are in general different, adjustment of the precise
departure times on the paths for average departure frequencies and
corresponding delays on these paths was another issue to be taken into
consideration.

For the general multi-path case, first the optimization problem
for a one-sensor-node network with many alternate paths to the
destination was analyzed. The correct objective function was derived,
minimizing which gives the optimal departure rates on each path, taking into
consideration the optimal adjustment of departure times. It was shown
that the objective function is convex in the neighborhood of the optimum
point; and evidence was presented that the only stationary point is the
optimum.

For the multi-path case with an arbitrary number of nodes, it was
argued that in general there are a number of stationary points. A
superlinearly convergent nonlinear programming algorithm was described for
optimization near the stationary points.

Until this point the message delays were treated as deterministic
functions of the traffic rate on the particular link on which the message
is travelling. In the last chapter, we studied some implications of
relaxing this assumption for the cases of random message delays and
interdependent delays.

For the random delay case, an approximate problem formulation
based on average delay of information was developed. For the
interdependent delay case, it was shown by way of a numerical example that
sometimes it is best for a sensor node to refrain from sending any
messages, since the adverse effect its messages have on the delays on